Editorial

On  Neurons  and  Other  Embarkations 

1 

Lull'd  in  the  countless  chambers  of  the  brain, 
Our  thoughts  are  link'd  by  many  a  hidden  chain. 
Awake  but  one,  and  lo!  what  myriads  rise! 
Each  stamps  its  image  as  the  other  flies. 

Samuel  Rogers 
Pleasures  of  Memory 


Questioners  in  applied  linguistics  often  ask  what  and  why. 
Less  frequently,  questions  of  when  and  who  arise.  This  edition  of 
Issues  in  Applied  Linguistics  is  devoted  to  questions  of  how  and 
where.  How  the  brain  processes,  acquires,  and  uses  language  have 
long  been  inflammatory  questions.  This  flammability  seems  to 
hinge  more  upon  the  qualities  of  language  than  upon  whether 
language  is,  need  be,  or  ought  to  be  reducible  to  its  mechanistic 
underpinnings.  While  the  latter  methodological  issues  are  indeed 
controversial,  without  asking  and  answering  the  former  questions 
they  seem  moot  points.  It  is  difficult  to  know  how  to  answer  those 
who  feel  language  can  be  significantly  studied  wholly  independent 
from  the  brain,  since  brains  seem  to  be  at  least  minimally  essential 
for  its  genesis.  Once  extant,  however,  why  is  there  a  need  to  study 
the  brain  at  all?  Can't  we  merely  study  linguistic  processes  per  se, 
assuming  that  as  we  do  so  we  will  be  illuminating  the  mind? 

The  writers  published  here,  I  believe,  and  certainly  the 
editors,  are  of  the  opinion  that  mind  is  brain  for  most  meaningful 
situations;  that  is,  that  the  theoretical  construct  mind  has  little  value 
in  the  formulation  of  hypotheses  about  language  and  its  mechanistic 
underpinnings.  The  emphasis  throughout  this  issue  is,  therefore,  on 
alternatives  to  current  neurobiological,  psychological,  and 
philosophical  views  on  language  and  brain. 

Issues in Applied Linguistics ISSN 1050-4273
© Regents of the University of California Vol. 3 No. 2 1992 203-208

Most of the research on brain and language has involved
neurolinguistic aphasiology, the study of language abnormalities in
adults whose language faculties have been disrupted as a result of
brain  injury  of  one  kind  or  another.  This  area  of  research  has  a 
crucial  place  in  history  and  an  assured  future.  Indeed,  most  of  what 
is  currently  known  about  language  in  the  brain  comes  from  studies 
of  this  type.  Since  neural  pathology  will  be  a  medical  fact  of  life  for 
some  time  to  come,  it  seems  only  reasonable  to  expect  this  fruitful 
type  of  research  to  continue.  However,  there  is  also  a  need  for  a 
different  kind  of  brain  research  to  begin,  a  kind  which  focuses  on 
the  mechanisms  underlying  the  processes  of  acquisition, 
memorization,  and  forgetting.  To  this  end,  John  Schumann  and 
several  of  his  students  have  established  the  UCLA  Neurobiology  of 
Language  Research  Group  (NLRG). 

The  technique  of  this  group  is  not  necessarily  "empirical," 
although  activity  of  this  type  is  certainly  welcome.  Due  to  an 
acknowledged  need  in  neuroscience  for  theoretical  development  on  a 
comprehensive  scale,  a  generation  of  "Platonic"  scientists  whose 
sole  function  is  to  review,  assimilate,  and  apply  the  work  of 
laboratory scientists is being developed. The field is frequently
referred  to  as  Theoretical,  or  Speculative,  Neuroscience.  The 
students  of  the  NLRG  should  be  seen  within  this  interdisciplinary 
light. 

Implicit  in  all  that  this  group  discusses  within  these  pages  is 
the  awareness  that  what  is  being  studied  is  the  process  of  how  the 
brain  learns  and  remembers.  To  this  end,  it  is  frequently  necessary 
to  voyage  into  areas  which  are  ostensibly  far  afield  from  traditional 
linguistics,  areas  like  neurochemistry  and  neurophysiology. 


Clay  is  moulded  to  make  a  vessel,  but  the 
utility  of  the  vessel  lies  in  the  space  where  there 
is  nothing.  .  .  .  Thus,  taking  advantage  of 
what is, we recognise the utility of what is not.

Lao  Tzu 
Tao  Te  Ching 

The  research  interests  of  the  authors  represented  in  this  issue 
of IAL are various. What they have in common is a belief that study
of the brain can not only shed light on current issues in the study of
language but also create as yet undiscovered areas of inquiry. The
aim  of  neurobiological  reductionism  is  not  the  mere  charting  of  the 
mechanisms  underlying  processes  linguists  have  already  posited, 
but  the  recognition  of  the  spaces  "where  there  is  nothing,"  spaces 
we  would  not  otherwise  realize.  With  this  goal  in  mind,  there  is  no 
way  that  reductionistic,  materialistic  study  of  language  and  its 
relationship  with  the  brain  can  be  called  "naive." 

The  first  article,  by  Larry  Lem,  introduces  us  to  one  of  the 
key  issues  facing  the  brain-language  discipline:  the  potential  which 
linguistically  "non-traditional"  areas  of  the  brain  have  for  informing 
continued  research.  Starting  with  an  explanation  of  how  much  of 
what  is  currently  known  about  the  relationship  between  brain  and 
language  is  the  result  of  the  historical  predominance  of  a  few  critical 
areas  for  language  comprehension  and  production,  Lem  goes  on  to 
posit  alternative  brain  areas  as  important  to  language.  This  essay  is 
of  critical  importance  because  it  not  only  expands  the 
neurobiological  field  of  inquiry,  but  also  forces  the  issue  of  how  we 
must  look  at  the  brain  as  a  whole  if  we  want  to  find  a  language 
acquisition  device. 

Much  of  what  we  learn  about  language  is  not  of  the  type 
about  which  learners  can  talk.  There  are  some  forms  of  linguistic 
knowledge  which  are,  clearly,  implicit.  How  different  or  specific 
types  of  linguistic  knowledge  might  be  differentially  represented, 
acquired,  and  stored  within  the  brain  is  the  topic  of  our  second  main 
article.  Scarlett  Robbins,  the  assistant  editor  of  this  journal,  has 
written  a  piece  in  which  she  demonstrates  how  procedural 
knowledge  might  be  housed  in  the  brain. 

The  third  main  article,  by  Edynn  Sato  and  Bob  Jacobs, 
focuses  on  the  brain  mechanisms  necessary  for  the  processes  of 
selective  attention.  The  necessity  of  this  type  of  activity  for  the 
acquisition  of  language  is  obvious.  How  the  brain  goes  about  this 
activity  is  a  topic  which  the  authors  emphasize,  clearly  showing  how 
neurobiological  analysis  can  be  the  a  priori  approach  to  the  study  of 
language  acquisition  mechanisms. 

An  often  overlooked  area  of  the  study  of  language  is  how  we 
go  about  forgetting  it.  Asako  Yoshitomi  submits  a  piece  showing 
how  the  analysis  of  how-we-forget  is  just  as  crucial  to  our 
understanding  of  the  representation  of  language  in  the  brain  as  the 
study  of  how-we-remember.  She  gives  us  considerable  evidence 
from  psychological  and  neurobiological  studies  supporting  her 
model of language attrition.

In our last issue, we published an article by Yasuhiro Shirai
on  Connectionism,  a  methodological  and  conceptual  schema  through 
which  language  transfer  could  be  explained.  In  our  Exchange 
section,  Cheryl  Fantuzzi  criticizes  many  of  the  premises  of  Shirai's 
arguments  and  questions  cognitive  modeling  in  general. 

We  are  fortunate  to  have  four  reviews  to  offer  in  this  issue. 
Eduardo  Faingold  reviews  a  book  called  "Why  More  English 
Instruction  Won't  Mean  Better  Grammar."  Charlene  Polio 
comments  upon  the  state  of  the  field  of  applied  linguistics  in  her 
review  of  Larsen-Freeman  &  Long's  Introduction  to  Second 
Language  Acquisition.  Jim  Purpura,  the  advertising  manager  of 
IAL, reviews Ellen Bialystok's book, Communication Strategies: A
Psychological  Analysis  of  Second-Language  Use,  which  posits  a 
new  psychological  framework  to  account  for  language  learners' 
production  strategies.  And  Howard  Williams  critiques  the  second 
volume  of  Talmy  Givon's  recent  work  on  morphology  and  syntax. 


In  creating,  the  only  hard  thing's  to  begin; 

A  grass-blade's  no  easier  to  make  than  an  oak. 

James  Russell  Lowell 
A  Fable  for  Critics 

Il n'y a que le premier pas qui coûte.
[It is only the first step which takes the effort.]
(This quote refers to the legend of Saint Denis, who
walked away from his own execution carrying his
head.)

Madame Marie Vichy-Deffand
Lettre à d'Alembert


A  little  over  a  year  ago,  I  began  working  on  this  journal  in 
the  capacity  of  production  assistant.  It  was  the  type  of  beginning 
which,  at  the  time,  seemed  reasonably  sound  and  safe:  a  little  work, 
learn  the  ropes,  and,  with  luck  and  years  of  dedication,  be  allowed  a 
position  among  the  senior  editors.  Little  did  I  know.  This 
publication  is,  I  need  not  remind  the  reader,  a  production  of  graduate 
students. As students' lives progress, they frequently feel the need
for certain changes as a result of minor hurdles inherent in the path
to academic advancement. As a result, within my first year at IAL I
moved  (rather  quickly)  through  the  position  of  managing  editor  to 
the  position  I  now  have  the  very  real  honor  of  holding.  I  thank  the 
former  editor,  Sally  Jacoby,  and  the  former  managing  editor,  Patrick 
Gonzales,  for  their  support  and  mentorship.  I  also  thank  those  who 
judged  me  capable  of  assuming  this  position,  including  Marianne 
Celce-Murcia  and  John  Schumann,  whose  continued  support  will 
never  be  forgotten. 

This  third  issue  with  which  I  am  involved  happens,  quite 
coincidentally  I  assure  you,  to  be  one  which  is  quite  close  to  my 
heart.  As  a  member  of  the  NLRG,  I  have  been  intimately  acquainted 
with  the  development  of  this  particular  issue.  The  idea  of  a  second 
guest-edited  issue  occurred  as  a  result  of  our  first  guest-edited  issue 
of  one  year  ago.  The  response  to  that  issue  was  (and  continues  to 
be)  so  enthusiastic  that  we  have  decided  to  continue  the  practice  of 
thematically  focused  issues.  John  Schumann  mentioned  to  Sally 
Jacoby  that  there  was  a  body  of  papers  written  for  his  courses  that 
might  make  an  interesting  journal  issue.  One  year  later,  here  we  are. 

Also, this issue of IAL is the first in which none of the
current  editorial  staff  or  their  assistants  can  say  they  were  around 
when  the  idea  of  founding  a  journal  focusing  on  the  important 
arguments  in  our  field  was  conceived.  We  new  folk  hope  that  in  the 
ensuing years we may continue to foster the idea, to
uphold  the  commitment  to  excellence  of  the  material  which  goes 
between  these  covers,  and  to  institutionalize  Issues  in  Applied 
Linguistics  as  one  of  the  most  eminent  journals  in  the  field.  To  this 
latter  end,  I  am  pleased  to  announce  a  European  distributor  for  the 
journal,  subscriptions  on  six  continents,  and  the  assemblage  of  a 
new,  enthusiastic  editorial  board  with  a  great  many  talented  people 
interested  in  assisting.  The  future  of  this  journal  looks  very  bright 
indeed. 

I  would  like  to  thank  the  readers  of  the  articles  for  this  issue 
who  gave  of  their  time  and  expertise  to  help  make  these  papers 
accurate  and  interesting.  I  would  also  like  to  thank  all  those  who 
helped  to  make  this  issue  a  reality,  including  the  other  members  of 
the  editorial  committee  and  the  production  assistants.  Most  of  all  I 
would  like  to  thank  the  guest  editor,  John  Schumann,  without 
whose  helpful  guidance  and  stalwart  vision  this  thematic  issue 
would  not  be. 

December, 1992 Joseph R. Plummer


Guest Editorial

Exploring  Neurobiology  of  Language 


John  Schumann 

University  of  California,  Los  Angeles 


Since  the  mid  1980's,  there  has  been  a  growing  interest  in 
the  cognition  underlying  Second  Language  Acquisition  (SLA). 
The  typical  procedure  for  discovering  cognitive  processes  is  to 
study  interlanguage  behavior  and,  on  the  basis  of  the  patterns 
observed,  to  infer  causal  cognitive  mechanisms  and  processes. 
This  mode  of  research  has  produced  an  interesting  set  of  constructs 
that  includes  buffers,  filters,  analyzers,  formulators, 
conceptualizers,  articulators,  interpreters,  monitors,  as  well  as 
pidginization,  nativization,  generalization,  simplification,  transfer, 
etc.  But  there  may  be  another  way  to  understand  the  mechanisms 
and  processes  that  are  responsible  for  SLA,  and  that  is  by  relating 
SLA  to  the  biological  organ  responsible  for  it,  the  brain.  As 
Bechtel  (1992)  indicates,  constructs  such  as  those  above  may 
constitute  mechanisms  to  explain  phenomena  observed  at  the 
behavioral  level.  But  these  mechanisms  themselves  become 
phenomena  to  be  given  mechanistic  explanations  at  the 
neurobiological  level.  The  reduction  often  leads  to  new 
understanding  and  perhaps  reformulation  at  the  higher  level. 

At UCLA, one of our doctoral students, Bob Jacobs, began
studying  neurobiology  in  1985.  Bob's  interest  in  the  relationship 
between  applied  linguistic  issues  and  neurobiology  led  to  a  series 
of  informal  meetings  during  the  fall  of  1986  between  Professor 
Arnold Scheibel (Director of the UCLA Brain Research Institute),
Professor  Wolfgang  Klein  (Director  of  the  Max  Planck  Institute  in 
Holland,  a  visiting  professor  at  the  time),  Bob,  and  myself.  Using 
Bob's  work  (i.e.,  Jacobs,  1988)  as  a  basis  for  discussion,  we 
explored  a  range  of  issues  concerning  neurobiology  and  second 
language  acquisition.  These  discussions  led  to  a  plan  to  institute  a 
neurobiology course in the Applied Linguistics Program in the
autumn of 1987. The course was taught by Dr. Scheibel and Bob
that  year  and  again  by  Dr.  Scheibel  in  1989  and  1991.  This  has 
resulted  in  about  30  students  who  now  have  a  background  in 
neurobiology  to  which  they  can  refer  in  their  studies  of  language 
acquisition,  assessment,  and  use.  Our  plan  is  to  continue  to  offer 
the  course  every  other  year  as  part  of  our  department's  program  to 
develop  a  laboratory  system  for  doctoral  training  (see  Celce- 
Murcia,  1992).  In  this  regard,  we  have  established  the 
Neurobiology  of  Language  Research  Group  (NLRG). 

With  a  knowledge  of  basic  neuroanatomy  and 
neurophysiology  there  are  several  approaches  to  the  study  of 
cognition  in  SLA.  One  can  relate  interlanguage  behavior  to  parts 
of  the  brain  that  are  known  to  generate  similar  behaviors.  One  can 
also  take  the  mechanisms  and  processes  that  have  been  inferred 
from  SLA  studies  and  match  them  with  neural  structures  and 
functions  known  to  operate  in  similar  ways.  Finally,  one  can  start 
with  areas  of  the  brain  responsible  for  perception,  stimulus 
appraisal,  emotion,  attention,  and  memory,  study  how  they  operate, 
and  relate  them  to  SLA.  In  this  way,  neurobiology  can  provide  a 
new  perspective  on  the  discipline  (see  Jacobs  and  Schumann, 
1992). 

However,  as  the  members  of  our  department's  NLRG  have 
been  quick  to  learn,  there  is  some  reluctance  in  the  field  to  the  view 
that  cognitive  processes  are  neural  processes  and  that  the  study  of 
the  brain  can  inform  the  study  of  multilingualism.  For  example, 
my  earlier  work  on  SLA,  the  pidginization/acculturation  model, 
was  a  social-psychological  account  of  SLA  that  was  often 
criticized  for  lacking  a  cognitive  component.  At  a  recent 
conference  I  presented  a  neurobiological  perspective  on  cognition 
in  SLA  and  afterwards  I  asked  a  colleague  who  had  been  urging 
me  for  years  to  add  cognition  to  the  model  what  she  thought.  My 
friend  replied,  "I  said  cognitive,  not  neuro!"  I  asked,  "Well,  where 
do  you  think  cognition  takes  place?"  Being  astute  in  such  matters, 
she  replied,  "If  I  don't  say  the  brain,  you'll  call  me  a  dualist."  She 
was  right.  Long  (1992)  has  raised  the  issue  of  whether  it  is  the 
applied  linguist's  job  to  provide  hypotheses  about  the 
neurophysiology  of  language  acquisition.  I  would  argue  that  it  is 
certainly  not  necessary  for  applied  linguists  to  undertake  such 
tasks,  but  if  they  were  to  acquire  the  requisite  knowledge,  there  is 
no  reason  why  they  should  not  theorize  about  the  neurobiology  of 
language.  If  the  enterprise  becomes  a  productive  research 
paradigm  that  furthers  our  understanding  of  language  acquisition 
and use, it may then be necessary for applied linguists routinely to
learn neurobiology in order to understand and evaluate the
explanations  that  are  generated. 

Thinking  about  neurobiology  and  language  has  been 
constrained  by  the  demand  that  neurobiological  accounts  address 
specific,  current  issues  in  linguistic  theory,  such  as  subjacency  or 
the  empty  category  principle.  When  neurobiologists  interested  in 
language  are  unable  to  do  this,  linguists  usually  assert  that  the 
problem  is  that  not  enough  is  known  about  the  brain.  However, 
there  is  a  great  deal  known  about  the  brain,  and  the  problem  may 
be  that  current  linguistic  theory  is  so  far  off  the  mark  or  so 
abstractly  formulated  that  it  defies  a  neurobiological  account. 
Before  a  neural  explanation  of  linguistic  theory  is  possible,  we  may 
need  a  much  more  sophisticated  formulation  of  the  theory. 

In  the  articles  in  this  special  issue  on  the  brain  and 
language,  we  have  deliberately  ignored  the  constraints  referred  to 
above.  Instead,  we  have  approached  the  topic  with  a  very  broad 
view  of  language.  We  examine  cognitive  processes  that  affect 
SLA  generally  (e.g.,  attention),  cognitive  processes  involved  in 
language  production  (e.g.,  procedural  knowledge),  memory 
systems  affecting  language  loss  (e.g.,  intermediate  memory),  and 
areas  of  the  brain  involved  in  lexical  knowledge  (e.g.,  the  fusiform 
cortex). 

In  these  articles,  we  have  not  avoided  using  the  relevant 
neurobiological  terms.  We  feel  that  expressions  such  as  "the 
attention  areas  of  the  brain"  or  "the  memory  centers  of  the  brain," 
while  sparing  the  reader  technical  vocabulary,  actually  are  a 
disservice  because  they  are  imprecise  and  do  not  allow  the  reader 
to  know  whether  the  authors  are  referring  to  the  same  or  different 
parts  of  the  anatomy.  The  articles,  therefore,  may  not  be  easy 
reading  for  someone  who  is  totally  naive  to  the  nervous  system,  but 
a  little  study  of  the  included  definitions,  sketches  and  diagrams, 
plus  a  careful  re-reading,  should  be  sufficient  for  anyone  who 
really wants to grasp the material.


Beyond Broca's and Wernicke's Areas:

A New Perspective on the Neurobiology of Language

Brain-based discussion of language has classically centered around
models focused on Broca's and Wernicke's areas. Recent neurobiological
research indicates that such models may be oversimplified. The present paper
attempts to propose a model in which a far greater number of brain structures
are  involved  in  language  functions.  To  demonstrate  this  model,  three  areas  of 
the  brain  rarely  associated  with  language,  the  anterior  cingulate  gyrus,  the 
prefrontal  cortex,  and  the  basal  temporal  language  area  (fusiform  gyrus)  are 
examined.  Recent  neurobiological  research  linking  these  areas  to  language 
function  will  be  reviewed  to  illustrate  that  a  whole-brain  view  of  language  is 
both  more  feasible  and  better  supported  by  data  than  the  idea  of  a  language- 
specific  brain  system,  such  as  the  Wernicke-Geschwind  model. 


INTRODUCTION 


Localization  theory,  which  advocates  that  various  abilities 
are  mapped  to  specific  anatomical  structures  in  the  brain,  was  first 
proposed  in  the  early  nineteenth  century  by  Gall  and  Spurzheim 
(1810).  This  theory  was  applied  to  language  abilities  in  1861 
when  Paul  Broca  attributed  speech  to  a  frontal  portion  of  the 
brain's  left  hemisphere,  an  area  that  became  known  as  Broca's 
area.  About  ten  years  later,  Karl  Wernicke  correlated  a  deficit  in 
comprehension  and  semantic  meaning  with  brain  damage  in  an 
area  of  the  left  temporal  lobe  now  known  as  Wernicke's  area. 
Wernicke  composed  a  theory  of  language  based  on  interaction 
between  various  sensory  areas  (such  as  auditory  areas  for  hearing 
language) in the brain with Wernicke's and Broca's areas
during language perception and production (Mayeux & Kandel,
1991).  Study  of  the  brain  and  language  has  continued  to  focus 
heavily  on  these  two  areas  and  has  produced  language  theories 
based  on  the  Wernicke  model  (Geschwind,  1965;  Penfield  & 
Roberts, 1959). In this model, the process of naming a seen
object begins with the transfer of information from the eyes
to the visual cortex. From there the information is taken to the
angular gyrus and then to the adjacent Wernicke's area. Here the
visual information becomes a phonetic representation. This
representation is sent through the arcuate fasciculus to Broca's
area, where it is conveyed to the motor cortex to initiate
articulation (Mayeux & Kandel, 1991).

UNDERSIDE OF THE BRAIN
Figure 1. The diagram of the right side of the brain shows a hatched
line that follows the fusiform gyrus down the underside of
the brain. The gyrus is pointed out by arrows on the
diagram of the left brain. The shaded area indicates the
region of the fusiform gyrus known as the basal temporal
language area.

Although  Wernicke's  model  and  subsequent  models  have 
contributed  to  our  current  understanding  of  brain  lesions  and 
language  deficits,  they  do  not  necessarily  explain  the  production 
of  language  in  normal  individuals.  While  the  deficits  can  be 
correlated with damaged anatomical structures, the size and
precise location of lesions are not exactly the same in all cases.
Thus,  corresponding  generalizations  to  language  production  may 
be  oversimplified.  The  model  indicates  that  two  areas  of  the 
cortex  specialized  for  language,  along  with  a  few  accessory  areas, 
are  involved  in  language  production.  Recent  neurobiological 
research  indicates  that  a  new  view  of  language  may  be 
hypothesized,  one  far  more  complex  than  the  traditional 
Wernicke-Geschwind  model  involving  Broca's  and  Wernicke's 
areas  along  with  a  few  connecting  structures  (Geschwind,  1965; 
Damasio  &  Geschwind,  1984).  A  more  plausible  model  would 
incorporate  a  larger  number  of  brain  systems  into  language,  rather 
than  simply  one  language-specialized  system. 

Recent  technological  developments  have  led  to  new 
methods  of  investigation  that  are  non-invasive  and  do  not  require 
a language-deficient subject. One such method, Positron
Emission  Tomography  (PET),  is  a  brain  imaging  technique  that 
allows  researchers  to  make  correlations  between  function  and 
anatomy  (Phelps,  1991a).  PET  determines  the  metabolic  rate  and 
the  relative  amount  of  blood  flowing  to  a  particular  brain  region, 
both  of  which  are  measures  of  the  relative  activity  of  that 
structure.  This  technique  has  allowed  researchers  to  determine 
which  parts  of  the  brain  are  active  during  normal  functions,  such 
as  attention,  audition,  or  eye  movement  (Phelps,  1991a).  Because 
of  the  increasingly  high  resolution  of  the  imaging,  the  activity  of  a 
number  of  structures  located  throughout  the  brain  may  be 
simultaneously  observed  (Phelps,  1991b). 

Use of PET, coupled with results from excision and
electrical stimulation studies, has allowed researchers to identify
whole  neural  systems  involved  in  brain  activities,  such  as 
language  (production  and  comprehension).  The  resulting  data 
indicate  that  numerous  processes  are  involved  in  language 
production  and  acquisition  and  that  each  of  these  processes 
involves a variety of distinct yet interconnected structures. For
example, a second language learner who generates a sentence
fluently must draw on the long term memory storage and retrieval of
vocabulary, as well as internalized knowledge regarding the
application (and operationalization) of grammatical rules. A
process  known  as  long  term  potentiation,  originating  in  the 
hippocampus  and  related  structures,  seems  to  lead  to  the  memory 
needed  for  vocabulary  storage.  In  addition,  rapid  retrieval  and  use 
of  grammatical  knowledge  may  be  viewed  as  a  procedural  skill, 
thus  involving  the  procedural  knowledge  memory  system  of  the 
cerebellum  in  language  (see  Robbins,  this  volume). 

Language  is  not  a  function  that  the  cerebellum  was 
expected to have, and yet the data seem to indicate that it does
have  a  role.  Previous  reports  on  the  functions  of  the  cerebellum 
focussed  on  the  motor  coordination  role  of  that  structure. 
Likewise,  recent  neurobiological  data  seem  to  implicate  three 
areas  previously  unsuspected  of  playing  a  role  in  language.  The 
fusiform  gyrus  appears  on  the  underside  of  the  brain  and  may  not 
have  been  accessible  to  surgeons  for  lesion  studies  in  times  past. 
Known  for  its  role  as  an  association  area,  the  anterior  cingulate 
has  previously  been  assigned  the  function  of  relaying  sensory 
information  to  other  areas  of  the  cortex.  The  prefrontal  cortex  has 
long  been  associated  with  cognitive  functions  and  planning,  but 
never with language. Linking this region with language indicates
that  not  only  this  region,  but  any  that  play  a  role  in  the  cognitive 
functioning  of  humans,  may  be  involved  in  language.  Because 
each  of  these  regions  is  also  known  to  play  a  role  thought  to  be 
unrelated  to  language,  they  are  excellent  candidates  for 
demonstrating  the  whole-brain  perspective  of  language. 


ANATOMY  OF  THE  FUSIFORM  GYRUS 


The  fusiform  gyrus  (also  known  as  the  lateral 
occipitotemporal  gyrus)  is  a  longitudinal  fold  of  cortex  located  on 
the  underside  of  the  temporal  lobe.  Connections  of  this  particular 
gyrus  are  not  well  established.  A  known  link  occurs  between  the 
fusiform  gyrus  and  the  area  near  the  angular  gyrus,  an  area 
adjacent  to  the  traditional  Wernicke's  area  (Bogousslavsky, 
Miklossy,  Deruaz,  Assal,  &  Regli,  1987).  The  basal  temporal 
language  (BTL)  area,  the  focus  of  discussion  here,  is  a  small 
region located on this gyrus (see Figure 1).


Basal Temporal Language Area


Wernicke's  area  is  generally  accepted  as  the  location 
where  processing  for  language  comprehension  occurs.  Electrical 
stimulation  studies  done  by  Luders  and  his  colleagues  have  shown 
that  the  anterior  portion  of  the  fusiform  gyrus  also  plays  some  role 
in  language  comprehension.  They  have  named  this  small  section 
the  basal  temporal  language  area  (BTL). 

Luders and colleagues originally located the area through a
case study and then verified their findings in a more extensive
study.  In  the  case  study,  a  man  with  intractable  complex  partial 
seizures  was  evaluated  in  preparation  for  surgical  treatment  of 
epilepsy  (Luders,  Lesser,  Hahn,  Dinner,  Morris,  Resor,  & 
Harrison, 1986). The man tested normal for intelligence on the
Wechsler Adult Intelligence Scale and displayed no physical
ailments other than epilepsy. A Wada test determined that he
was  left  hemisphere  dominant  for  the  computational  aspects  of 
language. 

Arrays  of  subdural  electrodes  were  surgically  implanted 
over  the  lateral  and  basal  surfaces  of  the  temporal  lobe  in  the  left 
hemisphere  to  identify  the  position  of  the  epileptogenic  focus  in 
the  left  temporal  lobe.  Stimulation  of  the  electrodes  in  the  lateral 
temporal  region  over  Wernicke's  area  produced  speech 
interference,  defined  as  an  inability  to  read  a  text  aloud.  Writing 
was  also  inhibited.  This  same  type  of  speech  deficit  was  observed 
when  the  basal  temporal  region  was  stimulated.  The  patient  was 
unable to comprehend either spoken or written language and
could  not  repeat  words  spoken  to  him.  Moreover,  the  patient  could 
not  write  words  he  had  been  instructed  to  write  prior  to  the  start  of 
stimulation.  Essentially,  a  global  aphasia  was  produced  by 
stimulation  of  BTL.  Luders  and  colleagues  determined  the 
aphasia  to  be  a  language  specific  interference  by  eliminating  the 
other  possible  causes  for  speech  arrest.  The  possibility  of 
stimulation  producing  a  seizure  was  eliminated,  as  no  other 
clinical  signs  of  a  seizure  were  evident  and  none  of  the 
widespread  electrical  activity  that  usually  accompanies  a  seizure 
was  detected.  The  speech  arrest  was  not  due  to  a  negative  motor 
effect  (motor  inhibition  due  to  stimulation),  as  the  patient  was  still 
able  to  produce  rapid  alternating  movements  in  his  hands  and  lips. 
Nor  did  the  stimulation  produce  a  general  processing  interruption, 
as  the  patient  was  still  able  to  memorize  complex  geometric 
designs  and  draw  them  from  memory  during  stimulation.  Luders 
and  colleagues  concluded  that  they  had  happened  upon  a  true
language  area. 

Luders  and  his  colleagues  then  performed  a  more 
extensive  electrical  stimulation  study  of  22  epileptic  patients 
(Luders,  Lesser,  Hahn,  Dinner,  Morris,  Wyllie  &  Godoy,  1991). 
In  eight  of  the  22  patients,  a  basal  temporal  language  area  was 
identified,  compared  to  15  of  22  for  Broca's  area  and  14  of  22  for 
Wernicke's  area.  In  the  eight  patients  who  exhibited  a  basal 
temporal  language  area,  the  language  deficit  was  elicited  with 
stimulation  only  in  the  dominant  hemisphere,  never  in  the 
nondominant  temporal  lobe.  Moreover,  the  degree  of  language 
interference,  as  measured  by  the  ability  to  read  aloud,  was  found 
to  increase  with  the  strength  of  the  stimulation.  Three  of  the  8 
patients  were  tested  in  more  detail  to  determine  the  extent  of  their 
aphasia.  Two  of  the  patients  could  not  repeat  words  during 
stimulation;  the  third  was  able  to  communicate  only  by  gestures. 
Verbal  comprehension  as  tested  by  the  Token  test  (following 
simple  one  and  two-step  commands)  was  inhibited  by  stimulation 
to  some  degree  in  all  three.  All  three  patients  were  also  unable  to 
name  objects  presented  to  them  during  stimulation.  The  patients 
all  had  severe  agraphia  (i.e.,  inability  to  write)  during  stimulation 
as  well.  As  before,  the  language  interference  could  best  be 
characterized  as  a  global  aphasia  in  all  the  patients. 

To  verify  the  specificity  of  the  language  interference, 
several  other  functional  tests  were  conducted.  Using  Kohs  block
tests  (Stone,  1985)  to  examine  intellectual  function  in  two
patients,  Luders  and  colleagues  found  that  the  patients  were  able  to
perform  relatively  complex  tasks  without  any  sign  of  inhibition.
Motor  activity  was  determined  to  be  unaffected  because  rapid 
alternating  tongue  movement  was  possible  (no  negative  motor 
effect).  Patients  also  performed  complex  nonverbal  tasks  without 
difficulty  and  facial  recognition  posed  no  problems  for  the 
patients  either.  The  occurrence  of  circumlocutions  and  the  ability
to  communicate  with  gestures  point  to  intact  mental  functioning,
ruling  out  a  disturbance  of  consciousness.  These  tests  suggest  that
the  language  inhibition  was  not  due  to  intellectual  dysfunction,
motor  inhibition,  or  epileptiform  disturbances  of  consciousness.

Luders  and  colleagues  were  able  to  identify  the  exact 
location  of  the  basal  temporal  language  area  by  controlling  which 
electrode  in  the  implanted  array  was  used  during  the  stimulation. 
X-rays  and  surgical  inspection  confirmed  the  positions  of  the
electrodes.  The  basal  temporal  area  began  about  3  to  3.5  cm  from
the  anterior  tip  of  the  temporal  lobe  in  all  cases  and  varied  at  the 
posterior  end  of  the  fusiform  gyrus  between  4  and  7  cm  from  the
temporal  pole  (see  Figure  2).  The  variation  in  size  of  the  basal
temporal  area  was  not  addressed  by  Luders  and  colleagues  in
relation  to  the  extent  of  language  loss.

LEFT  HEMISPHERE  OF  THE  BRAIN


Figure  2.  Diagram  of  the  lateral  (outer)  surface  of  the  left  hemisphere  of  the
brain.  The  arcuate  fasciculus  is  a  bundle  of  fibers  that
extends  from  the  temporal  lobe,  near  Wernicke's  area, 
curves  around  the  angular  gyrus  and  travels  forward  to  the 
prefrontal  cortex,  near  Broca's  area.  The  prefrontal  cortex 
is  indicated  as  Brodmann's  areas  9,  10,  11,  and  46. 


Three  of  the  patients  required  removal  of  the  basal 
temporal  language  area  to  treat  their  epilepsy  and  in  two  cases,  no 
verbal  deficit  was  detected  postoperatively.  In  the  third  case,  a 
slight  temporary  aphasia  occurred  that  disappeared  six  months 
postoperatively. 


THE  FUSIFORM  GYRUS  (BTL  AREA)  IN  LANGUAGE


The  presence  of  a  third  language  area  is  a  surprising 
development  that  certainly  requires  modification  of  the  Wernicke-
Geschwind  model  for  language  processing,  particularly  because
stimulation  in  the  basal  temporal  area  and  Wernicke's  area 
produced  similar  types  of  effects.  A  negative  motor  area  (region 
of  the  brain  which,  when  stimulated,  causes  inhibition  of  motor 
activity)  overlaps  parts  of  Broca's  area  and  stimulation  in  non- 
motor  regions  within  Broca's  area  produces  similar  effects. 
Noting  that  language  inhibition  was  not  produced  in  stimulation 
of  any  part  of  the  inferior  temporal  gyrus,  Luders  and  colleagues 
(1986)  concluded  that  the  basal  temporal  area  is  not  merely  a 
previously  undetected  portion  of  Wernicke's  area,  but  a  separate 
entity  that  may  work  in  conjunction  with  the  classical  language 
areas.  Luders  and  colleagues  also  speculate  that  the  expressive  aphasia  associated
with  Broca's  area  could  be  due  to  lesions  in  both  the  language  and
motor  areas  in  that  location,  while  the  communicative  aphasia
associated  with  Wernicke's  area  occurs  because  damage  is
confined  to  language  areas  (Luders,  Lesser,  Dinner,  Morris,
Wyllie,  &  Godoy,  1988).  In  other  words,  if  only  the  language
areas  within  Broca's  area  were  damaged,  a  comprehension  aphasia
would  result  and  not  an  expressive  one.  Since  similar  deficits  are
produced  in  the  three  language  areas,  they  infer  that  the  areas  all
work  in  conjunction  and  that  an  extensive  direct  connection  must
exist  between  these  areas.  The  bundle  of  fibers  called  the  arcuate 
fasciculus  does  indeed  originate  in  the  temporal  lobe  and  travel 
back  towards  Wernicke's  area  (see  Figure  2)  before  proceeding 
forward  into  the  prefrontal  cortex  near  Broca's  area  (Geschwind, 
1979).  Conduction  aphasia,  an  aphasia  involving  fluent  speech, 
poor  speech  repetition  abilities,  and  intact  auditory  comprehension,
seems  to  be  an  intermediate  aphasia,  having  characteristics  of  both 
Broca's  and  Wernicke's  aphasias.  Anatomical  evidence  indicates 
that  this  third  aphasia  is  caused  by  arcuate  fasciculus  lesions, 
disrupting  the  communication  between  Wernicke's  and  Broca's 
areas  (Damasio  &  Damasio,  1980),  as  predicted  by
Wernicke's  original  model  (Geschwind,  1964;  Damasio  &
Geschwind,  1984).  A  connection  between  the  basal  temporal 
region  and  Wernicke's  area  has  not  yet  been  identified,  but  Luders 
and  colleagues  hypothesize  that  it  exists.  They  cite  Rosene  and
Van  Hoesen's  (1977)  work  showing  the  existence  of  direct  connections
between  the  hippocampus  and  the  cortex  in  the  frontal  and 
temporal  lobes  as  evidence  for  this  possibility.  Connections
between  the  fusiform  gyrus  and  the  angular  gyrus  may  also 
provide  a  possible  indirect  connection  between  Wernicke's  area 
and  this  basal  temporal  language  area. 

If  these  three  brain  areas  work  in  conjunction,  the  question 
arises  why  Broca's  and  Wernicke's  areas  were  identified
so  much  more  frequently  than  the  basal  temporal  language  area.
The  surgical  data  suggest  that  the  basal  temporal  region  is  not  a
crucial  area,  since  two  of  the  three  patients  whose  basal  temporal
area  was  surgically  removed  showed  no  postoperative  aphasia  and
the  third  showed  only  a  slight  aphasia  that  disappeared  after  six  months
(Luders  et  al.,  1991).  It  is  possible  that  the  function  is  bilateral
and  that  removal  of  the  area  in  the  dominant  hemisphere  causes  previously
existing  connections  in  the  nondominant  hemisphere  to  be
reinforced,  a  process  which  requires  time.  It  would  be  interesting
to  see  whether  a  bilateral  resection  of  the  basal  temporal  region 
would  reproduce  the  effects  of  electrical  stimulation.  Electrical 
stimulation  tests  on  the  basal  temporal  region  of  the  nondominant 
hemisphere  after  recovery  from  resection  of  the  basal  region  of 
the  dominant  temporal  lobe  may  also  provide  that  information. 

The  data  indicate  that  a  basal  temporal  language  area
exists  in  some  patients.  Because  it  is  not  seen  in  every
patient,  the  part  it  plays  in  the  whole  scheme  of  language
generation  remains  uncertain.  Because  the  data  indicate
that  there  is  no  problem  with  the  input  (visual  or  auditory)  stage, 
perhaps  the  interference  lies  in  the  processing  stage,  where  the 
patients  "had  no  access  to  the  verbal  engrams  which  establish  the 
link  between  symbolic  verbal  material  and  the  corresponding 
nonverbal  expressions"  (Luders  et  al.,  1991,  p.  751). 


ATTENTION  (DETECTION)  AND  LANGUAGE 

(The  Anterior  Cingulate) 


One  of  the  anterior  cingulate's  primary  functions  is  in 
attention.  In  order  to  see  the  relevance  of  this  structure  to 
language,  one  must  first  understand  the  importance  of  attention  to 
language  processes.  Thus,  the  role  of  attention  in  language  will 
be  addressed  before  a  description  of  the  anterior  cingulate's 
anatomy  is  presented. 

In  addition  to  the  processes  mentioned  above,  adult  second 
language  learners  also  use  conscious  self-monitoring  when 
learning  the  second  language,  such  as  when  the  learner  speaks,
listens  or  writes  (Chamot,  Kupper,  &  Impink-Hernandez,  1988a, 
1988b;  Rubin,  1981).  For  example,  in  response  to  a  verbal 
message,  several  steps  must  ensue.  In  a  greatly  simplified 
scenario,  the  auditory  signal  must  first  be  filtered  from  all  other 
stimuli  (see  Sato  &  Jacobs,  this  volume).  The  auditory  signal  is 
then  processed  in  the  brain  to  associate  meaning  with  the  sounds. 
Using  the  associated  meaning,  cognitive  planning  must  then  occur 
to  determine  the  appropriate  response  to  the  auditory  message. 
After  the  planning  occurs,  the  appropriate  brain  areas  are  recruited 
to  generate  the  response.  Monitoring  of  the  output  is  also 
necessary,  particularly  in  nonfluent  learners,  to  produce 
grammatically  correct  responses  to  the  heard  speech.  This 
monitoring  requires  that  attention  be  focused  on  the  appropriate
stimuli,  a  focusing  which  has  been  shown  to  be  a  function  of  the 
anterior  cingulate.  The  anterior  cingulate  may  then  be  viewed  as  a 
participant  in  the  self-evaluation  process.  The  prefrontal  cortex 
has  been  implicated  in  cognitive  planning  and  organization  and 
thus  also  seems  to  play  an  important  role  in
language.  Although  children  seem  to  use  these  cognitive 
functions  in  first  language  acquisition,  the  prefrontal  cortex  is  not 
completely  developed  during  the  time  of  the  acquisition,  thus 
complicating  the  picture.  Neurons  in  the  prefrontal  cortex 
(specifically  pyramidal  cells  in  cortex  layer  III)  continue  to 
enlarge,  both  in  cell  body  size  and  branching  of  the  neuron 
projections,  until  approximately  ten  years  of  age  (Mrzljak, 
Uylings,  Van  Eden  &  Judas,  1990).  Neurons  in  the  prefrontal 
cortex  are  also  among  the  last  neurons  in  the  brain  to  myelinate 
(coat  themselves  with  a  fat-like  substance)  (Fuster,  1980). 
Because  the  extensiveness  of  neuron  projections  and  myelination 
facilitate  information  exchange  and,  thus,  processing,  the 
prefrontal  cortex  cannot  be  considered  fully  functional  until 
maturation  is  complete.  Correspondingly,  the  functions  of  the 
prefrontal  cortex  are  more  easily  defined  in  the  adult  second 
language  learner  (whose  cortex  has  completed  maturation).  Thus 
our  discussion  will  focus  on  the  role  of  the  prefrontal  cortex  and 
anterior  cingulate  in  second  language  learning  and  production, 
rather  than  first  language  acquisition. 

The  acquisition  and  production  of  a  second  language  in 
adults  involve  a  general  attention  system.  For  the  purpose  of  this 
paper,  attention  is  defined  as  "the  ability  to  select  or  focus  on  a 
small  fraction  of  the  incoming  sensory  information"  (Corbetta, 
Miezin,  Dobmeyer,  Shulman  &  Petersen,  1991).    Within  this 
definition,  attention  may  be  parsed  into  three  functions  (Posner  &
Petersen,  1990):  1)  orientation  to  sensory  information,  2) 
detection  of  specified  information  for  processing,  and  3) 
maintenance  of  alertness.  Only  detection  will  be  dealt  with  here 
since  it  is  involved  in  language  and  self-evaluation. 

Detection  is  defined  as  the  identification  of  a  targeted  item 
(Posner  &  Petersen,  1990).  That  target  includes  information  in 
stored  memory  as  well  as  sensory  data.  In  the  process  of  language 
production  and  acquisition,  detection  provides  the  means  by 
which  a  learner  can  monitor  his  language  production.  For 
example,  using  Bialystok's  (1978)  model  of  second  language 
learning,  consider  a  nonfluent  adult  learning  Spanish  as  a  second 
language.  When  speaking,  the  learner  will  consciously  identify 
the  subject  of  the  sentence  and  then  mentally  review  a  list  of  verb 
forms  to  find  the  appropriate  one,  before,  during,  and  after 
generating  the  sentence.  Krashen's  (1981)  monitor  model  posits 
the  use  of  conscious  self-evaluation  when  actively  studying  to 
learn  a  language,  but  not  during  "acquisition,"  the  unconscious 
learning  of  a  language.  Detection  may  then  be  viewed  as 
monitoring  in  Krashen's  monitor  model.  This  detection  also 
allows  the  learner  to  evaluate  the  responses  of  the  listener. 

The  generalized  attention  system,  under  which  detection 
falls,  may  be  viewed  as  two  smaller  entities,  the  posterior  and 
anterior  attention  systems  (Posner  &  Petersen,  1990).  It  is  the 
anterior  system  that  plays  a  key  role  in  detection  during  cognitive 
functions,  which  include  language.  Within  the  anterior  attention 
system,  one  of  the  main  players  is  the  anterior  cingulate;  thus,  it
will  be  the  focus  of  the  discussion  on  attention  (detection)  and 
language. 


ANATOMY  OF 
THE  ANTERIOR  CINGULATE  GYRUS 


The  anterior  cingulate  gyrus  is  an  outward  fold  in  the 
cortex,  located  on  the  medial  surfaces  of  the  frontal  lobes  in  both 
hemispheres  of  the  brain.  The  cingulate  gyrus  curves  around  the 
corpus  callosum,  the  bundle  of  fibers  connecting  the  two 
hemispheres  (see  Figure  3).

The  distinction  between  the  anterior  and  posterior  regions 
of  the  cingulate  is  important  since  the  two  regions  have  differing 
input  connections  (Vogt,  1985)  and  functions  (Vogt,  1987).  The 
posterior  cingulate  is  involved  in  pain  and  reactions  to  noxious
stimuli.  The  anterior  cingulate  is  the  portion  of  the  gyrus  to  which 
the  language  functions  are  being  ascribed. 

The  anteromedial  and  mediothalamic  nuclei  provide  the
majority  of  thalamic  input  for  the  anterior  cingulate  (Vogt,  1985). 
The  anterior  cingulate  also  receives  input  from  higher  order 
sensory  cortex  (not  primary  sensory  cortex)  and  sends  output  to 
the  prefrontal  cortex  (Kupfermann,  1991).  Existence  of 
connections  between  the  prefrontal  cortex  and  the  anterior
cingulate  would  increase  the  possibility  that  they  affect  each  other
during  language  processing,  if  that  is  shown  to  be  a  function  in
either  location.  As  the  main  input  to  the  anterior  cingulate,  the
mammillothalamic  tract  will  also  prove  to  be  important.
Consequently,  the  mammillothalamic  tract  and  the  connection  with
the  prefrontal  cortex  are  central  to  the  discussion  presented  below.

MEDIAL  VIEW  OF  THE  BRAIN

Figure  3.  Medial  view  of  the  right  side  of  the  brain.  The  anterior
cingulate  is  the  fold  of  brain  tissue  just  above  the  corpus
callosum.  The  anterior  portion  of  the  cingulate  is  pointed
out  above.


THE  ROLE  OF  THE  ANTERIOR  CINGULATE
IN  DETECTION/SELECTION 

Regional  blood  flow  in  the  brain  has  been  shown  to 
increase  in  the  anterior  cingulate  during  single  word  association 
tasks  (Petersen,  Fox,  Posner,  Mintun  &  Raichle,  1988).  These 
tasks  required  normal  subjects  to  generate  a  verb  semantically 
related  to  a  visually  presented  noun  (e.g.,  "eat"  when  presented 
with  "food").  Subjects  were  also  asked  to  monitor  lists  of  words, 
identifying  those  belonging  to  a  particular  semantic  category  (e.g., 
searching  a  word  list  for  different  foods).  PET  data  indicated 
activation  of  the  anterior  cingulate  in  both  tasks.  Interestingly, 
the  amount  of  activation  seen  in  the  anterior  cingulate  was  found 
to  be  greater  for  lists  containing  many  words  from  the  selected 
categories  (Petersen  et  al.,  1988;  Petersen,  Fox,  Posner,  Mintun  & 
Raichle,  1989).  Activation  in  a  part  of  the  left  prefrontal  lobe  was 
also  observed.  However,  similar  blood  flow  increases  were  not 
seen  during  visual  presentation  alone  (no  response  required  from 
the  subject),  nor  during  an  output  task  in  which  the  subject 
verbally  repeated  the  word  presented  on  the  screen. 

The  verb  generation  task  included  speech  (reporting  the
verb),  visual  processing,  semantic  association  between  a  verb
and  a  noun,  and  possibly  some  type  of  grammatical  computation  in
identifying  the  part  of  speech.  The  activation  attributable  to  each
component  of  a  task  was  determined  by  subtracting  the  PET  activation
values  for  a  nearly  identical  control  task,  differing  in  only  one
aspect,  from  the  values  for  the  task  of  interest.  For
example,  PET  values  for  a  person  viewing  and  reading  a  printed
noun  were  subtracted  from  those  for  a  person  viewing  a  noun  and
speaking  an  associated  verb.  The  semantic  association  is  the  only
difference  between  the  two  tasks  (Petersen  et  al.,  1989).  The
results  showed  that  the  anterior  cingulate  and  lateral  prefrontal
cortex  were  active  and  suggest  that  they  were  involved  in  that
association.  The  anterior  cingulate  was  also  implicated  in  the
detection  and  selection  of  words  because  its  activation
increased  on  the  word  list  tasks  with  more  target  words  (Posner,
Petersen,  Fox  &  Raichle,  1988;  Petersen  et  al.,  1988;  Petersen  et 
al.,  1989).  In  contrast,  the  lateral  prefrontal  cortex  activation  did 
not  vary  with  the  number  of  targets  in  the  lists. 
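
The  subtraction  logic  described  above  can  be  illustrated  with  a  minimal  sketch  in  Python.  The  regional  values,  the  threshold  of  1.0,  and  the  function  name  subtract_conditions  are  all  hypothetical  and  chosen  only  for  illustration;  they  are  not  data  or  procedures  reported  by  Petersen  and  colleagues.  The  sketch  assumes  each  scan  can  be  summarized  as  one  activation  value  per  region,  so  that  regions  engaged  equally  by  both  tasks  cancel  and  only  those  tied  to  the  added  component  remain.

```python
# Hypothetical regional activation values (arbitrary units).
# These numbers are invented for illustration, not taken from any study.

def subtract_conditions(task, control):
    """Return task-minus-control activation for regions present in both scans."""
    return {region: task[region] - control[region]
            for region in task if region in control}

# Control task: view a printed noun and read it.
read_noun = {
    "visual cortex": 12.0,
    "motor cortex": 9.0,
    "anterior cingulate": 3.0,
    "lateral prefrontal cortex": 2.5,
}

# Task of interest: view a noun and speak an associated verb.
generate_verb = {
    "visual cortex": 12.0,
    "motor cortex": 9.0,
    "anterior cingulate": 7.0,
    "lateral prefrontal cortex": 6.5,
}

difference = subtract_conditions(generate_verb, read_noun)

# Keep regions whose difference exceeds an arbitrary illustrative threshold.
active = sorted(region for region, delta in difference.items() if delta > 1.0)
print(active)
```

With  these  invented  values,  the  visual  and  motor  regions  cancel,  and  only  the  anterior  cingulate  and  the  lateral  prefrontal  cortex  remain  above  threshold.  This  mirrors  the  reasoning  of  the  subtraction  method,  though  not  the  actual  measurements.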

The  fact  that  activation  of  the  anterior  cingulate  is  stronger 
in  searching  lists  with  more  targets  strongly  suggests  that  the 
cingulate  is  involved  in  detection.  The  only  difference  between 
the  tasks  of  searching  a  list  with  fewer  targets  and  searching  a  list 
with  more  targets  is  the  detecting  of  the  targets  and  generating
responses,  the  latter  also  being  considered  a  function  of  the 
anterior  cingulate  in  detection  or  "attention  for  action"  (Posner  et 
al.,  1988),  meaning  that  the  anterior  cingulate  selects  a  processing 
center  or  a  course  of  response  for  the  stimuli. 

Pardo,  Pardo,  Janer,  and  Raichle  (1990)  have  also  studied 
the  anterior  cingulate  in  a  verbal  task.  Their  task  was  constructed 
under  the  Stroop  attentional  conflict  paradigm  in  which  subjects 
had  to  resolve  the  interference  between  word  reading  and  color 
naming  (Pardo  et  al.,  1990).  For  example,  subjects  were 
presented  the  word  "red"  in  green  letters  and  asked  to  name  the 
color  of  the  letters  as  quickly  as  possible.  Although  the  Stroop 
task  is  different  from  the  Petersen,  Fox,  Posner,  Mintun,  and 
Raichle  (1988)  experiments,  both  tasks  require  a  shunning  of  the 
tendency  to  read  the  noun  presented  while  generating  another 
word.  Although  it  does  not  require  a  selection/detection  of 
external  data,  the  Stroop  task  demands  selection  of  the  relevant 
processing  center  to  which  the  information  is  sent  (Pardo  et  al., 
1990).  PET  data  from  these  Stroop  conflict  tests  show  the 
greatest  activation  to  be  in  the  anterior  cingulate.  This  lends 
support  to  the  idea  that  the  anterior  cingulate  participates  in  the 
selection  of  a  processing  center  or  cognitive  operations. 

PET  scans  performed  by  Corbetta,  Miezin,  Dobmeyer, 
Shulman,  and  Petersen  (1991)  have  indicated  that  the  anterior 
cingulate  is  active  not  only  in  language  specific  tasks,  but  also 
during  tasks  requiring  complex  cognitive  functioning.  Their 
experiments  examined  two  situations:  a  selective  attention  and  a 
divided  attention  task.  In  the  selective  attention  task,  subjects 
were  required  to  detect  the  changes  in  a  single,  specific  feature 
(size,  color,  shape  or  speed)  of  moving  blocks  on  a  visual  screen.
Other  features  were  either  constant  or  also  shifting  during  the 
tests.  Alterations  in  features  not  specified  by  the  researchers  were 
to  be  ignored  during  the  selective  attention  task.  In  contrast,  the 
divided  attention  task  required  detection  of  a  change  in  any 
feature;  no  feature  was  specified  for  the  subject  to  focus  on. 

A  number  of  brain  regions  were  activated  in  both  tasks,
but  only  in  the  divided  attention  task  was  the  anterior  cingulate 
activated.  The  prefrontal  cortex  was  also  activated  in  the  divided 
task.  The  data  indicate  that,  cognitively,  the  brain  views  the  two 
tasks  "as  qualitatively  different"  (Corbetta  et  al.,  1991,  p.  2398). 
The  selective  attention  task  necessitated  the  maintenance  of  a 
focus  on  a  specified  feature.  However,  the  divided  attention  task 
did  not  involve  a  preset  focus,  but  rather  imposed  greater  demands 
in  the  coordination  and  comparison  of  information.  Corbetta,
Miezin,  Dobmeyer,  Shulman,  and  Petersen  (1991)  surmise  that  all 
the  sensory  information  regarding  the  various  features  is 
processed  in  the  prefrontal  cortex,  which  then  sends  data  to  the 
anterior  cingulate  for  response  selection. 

That  the  anterior  cingulate  functions  preferentially  in 
cognitive  tasks  was  demonstrated  in  an  experiment  by  Pardo,  Fox 
and  Raichle  (1991).  The  tasks  involved  were  solely  sustained
attention  tasks  for  sensory  stimuli  (e.g.,  noticing  touches
on  a  toe  or  changes  in  the  intensity  of  a  light).  The  authors
concluded  that  the  tasks  did  not  require  use  of  "high-level
processing  selection  systems  necessary  for  the  analysis  of
complex  targets"  (p.  63),  and  thus  did  not  require  the  anterior
cingulate's  usage.

Raichle  (1990)  has  noted  that  activation  in  the  prefrontal 
cortex  and  the  anterior  cingulate  during  Petersen,  Fox,  Posner, 
Mintun,  and  Raichle's  (1989)  verb  generation  task  was  only  found 
when  the  task  was  first  presented  to  the  subjects.  As  a  "new" 
task,  it  required  active  attention.  However,  after  the  task  had  been 
rehearsed  (e.g.,  practicing  the  generation  of  verbs  for  one 
specified  list  of  nouns  and  being  tested  for  the  same  list), 
activation  of  the  prefrontal  cortex  and  anterior  cingulate  was  no 
longer  seen.  In  other  words,  after  the  task  had  been  automated  by 
continuous  practice,  the  attention  was  no  longer  needed. 
Language  acquisition  is  similar;  once  the  grammatical  structures 
are  "acquired,"  they  become  automatic.  The  anterior  cingulate
may  have  a  role  in  language  acquisition  through  detecting  features 
or  selecting  responses  when  the  situations  are  still  novel. 
Processing  for  the  responses  occurs  elsewhere  when  the  responses 
have  been  practiced  enough  for  the  cerebellum  to  ingrain  them 
into  procedural  memory. 


ANATOMY  OF  THE  PREFRONTAL  CORTEX 


The  prefrontal  cortex  is  located  in  the  anterior  lateral 
portions  of  the  cerebral  cortex,  corresponding  to  Brodmann's 
(1909)  areas  9,  10,  11  and  46  (see  Figure  1).  The  connectivity  of
the  prefrontal  cortex  extends  throughout  the  brain.  This  high 
degree  of  connectivity  would  allow  it  to  play  the  role  of  integrator 
of  sensory  information  in  the  divided  attention  task  mentioned 
above  (Corbetta  et  al.,  1991).  For  the  present  purposes,  the  most 
important  connections  are  with  the  thalamus  and  the  anterior
cingulate. 


COGNITIVE  FUNCTIONS  OF
THE  PREFRONTAL  CORTEX


Lesion  studies  of  patients  have  indicated  that  the  prefrontal 
cortex  is  involved  in  cognitive  planning.  Lesions  in  the  prefrontal 
area  lead  to  a  disorder  termed  "central  motor  aphasia"  (Goldstein, 
1948),  which  is  characterized  by  slowed  spontaneous  speech  and  a 
low  level  of  expressivity.  Jackson  (1915)  reports  that  the  patients 
speak  in  short,  simple  sentences,  almost  as  if  their  ability  to  use 
complex  sentences  (e.g.,  those  with  subordinate  clauses)  is 
reduced.  Fuster  (1985)  concludes  that  these  lesions  impair  the 
ability  to  organize  complex  language  and  to  sequence  the  shorter 
clauses  for  the  longer  sentences.  Stuss  and  Benson  (1984)  report 
four  main  behavioral  deficits  as  a  result  of  prefrontal  damage:  a) 
an  inability  to  use  knowledge  about  a  task  to  complete  a  task,  b) 
inability  to  monitor  behavior  for  errors  and  to  use  the  errors  to 
modify  behavior,  c)  inability  to  establish  a  set  (frame  of 
reference),  and  d)  impaired  ability  to  perform  sequential  tasks. 
Citing  the  connections  between  the  frontal  and  prefrontal  cortex, 
McGrath  (1991)  speculates  that  breakdowns  in  the  looping 
connections  between  the  two  areas  cause  the  observed 
impairment,  much  of  which  affects  the  quality  of  language 
produced. 

Novoa  and  Ardila  (1987)  report  that  patients  with 
prefrontal  lesions  retain  the  formal  aspects  of  language  such  as 
lexical  and  phonetic  knowledge.  However,  patients  were  slow 
with  verbal  tasks  and  showed  signs  of  perseveration,  free 
association  of  ideas,  and  noticeable  failures  in  verbal  memory. 
Consequently,  Novoa  and  Ardila  suggest  that  the  lesions  cause
loss  or  impairment  of  the  capacity  to  elaborate,  or  express 
decisions  about  language. 

A  recent  study  by  Kelly,  Best  and  Kirk  (1989)  on  the 
reading  abilities  of  boys  also  implicates  the  prefrontal  cortex  in 
problem  solving  and  selective  attention.  They  examined  a  number 
of  cognitive  functions  considered  to  be  exclusively  the  domain  of 
the  prefrontal  cortex  and  some  functions  considered  to  be  the 
domain  of  the  posterior  cortical  areas.  The  subjects  of  their  study 
were  boys  who  either  were  able  to  read  (control  group)  or  had 
been  identified  as  having  difficulty  learning  to  read.  They  found
that  those  with  a  hindered  ability  to  read  or  to  learn  to  read 
showed  a  deficit  in  prefrontal  functions.  These  boys  did  not 
consistently  use  new  information  to  reformulate  hypotheses  in  a 
problem  solving  task,  nor  were  they  able  to  maintain  attention  to 
certain  aspects  required  by  the  tasks,  such  as  color  of  the  letters  in 
the  Stroop  test  (see  above).  They  conclude  that  the  "reading 
disabled  youngsters  have  difficulty  with  cognitive  processes 
involving  sustained  attention,  inhibition,  set  maintenance  [keeping 
a  frame  of  reference],  and  flexibility  in  generating  alternative 
hypotheses"  (p.  289). 

Although  the  above  research  indicates  that  the  prefrontal 
cortex  is  involved  in  planning  and  using  feedback  in  planning 
responses,  recent  PET  evidence  indicates  some  language-specific
functions  may  be  attributed  to  the  prefrontal  cortex  as  well.  In  the 
divided  attention,  but  not  the  selective  attention,  task  mentioned 
above  (Corbetta  et  al.,  1991),  the  prefrontal  area  was  identified  as 
an  active  area  along  with  the  anterior  cingulate.  Corbetta,  Miezin, 
Dobmeyer,  Shulman,  and  Petersen  (1991)  attribute  processing
for  complex  tasks  to  this  area  because  it  is  very  close  to  the  areas
found  to  be  active  in  the  semantic  association  tasks  of  Petersen,
Fox,  Posner,  Mintun,  and  Raichle  (1988).

Petersen,  Fox,  Snyder,  and  Raichle  (1990)  have  also 
presented  evidence  that  the  prefrontal  cortex  participates  in 
language  functions.  Their  PET  scan  study  indicates  that  the 
prefrontal  cortex  is  active  when  real  words  are  presented  visually, 
while  meaningless  symbols  or  word-like  nonword  strings  (e.g.,  sweed)
do  not  cause  similar  activations.  Petersen,  Fox,  Snyder,  and 
Raichle  (1990)  argue  that  this  area  is  involved  in  semantic 
association  since  the  word  form  is  similar  in  word-like  strings  of 
letters  and  true  words,  the  only  difference  being  the  association  of 
a  meaning  with  the  word  and  none  with  the  word-like  form. 

The  available  data  on  the  prefrontal  cortex  indicate  that  it
is  involved  in  planning  cognitive  tasks  and  in  evaluating  behavior. 
PET  studies  indicate  a  semantic  association  function  as  well.  In 
relation  to  adult  second  language  acquisition  then,  the  prefrontal 
cortex  may  participate  in  planning  and  editing  language  from 
sentence  level  grammar  up  to  cohesion  in  discourse  (McGrath, 
1991).  This  planning  and  editing  function  can  provide  a  plan  for 
evaluation  which  both  it  and  the  anterior  cingulate  may  perform. 
Such  an  evaluation  could  then  serve  as  the  criteria  that  the  anterior 
cingulate  uses  to  determine  the  next  area  in  the  brain  where 
further  processing  will  occur  (e.g.,  the  motor  areas  for  producing  a 
spoken  reply  to  a  call).  The  prefrontal  planning  role  would  only 
be  in  effect  during  the  acquisition  of  the  second  language  in  adults 
since  automating  some  of  these  grammar  and  discourse  plans  due 
to  constant  practice  would  relegate  responsibility  for  their 
performance  to  the  cerebellum,  to  which  both  the  prefrontal  cortex 
and  the  anterior  cingulate  are  connected  (Leiner,  Leiner  &  Dow, 
1989). 


SYNTHESIS 


The  fact  that  language-related  functions  can  be  localized  in 
these  three  previously  unrelated  structures  indicates  that  a  whole- 
brain  view  of  language  may  be  in  order.  If  the  BTL  area,  the 
prefrontal  cortex,  and  the  anterior  cingulate  have  long  remained 
unassociated  with  language  functions,  the  probability  that  other 
areas  of  the  brain  which  contribute  to  language  could  also  go 
unacknowledged  by  neurolinguists  is  very  high.  While  the 
Wernicke-Geschwind  model  has  proved  useful  in  language/brain 
research,  current  neurobiological  knowledge  has  shown  it  to  be 
oversimplified.  Because  the  Wernicke-Geschwind  model 
presupposes  a  language-specific  system  in  the  brain,  it 
automatically  narrows  the  focus  to  brain  regions  near  or  directly 
related  to  Wernicke's  or  Broca's  area.  It  thus  directs  research 
away  from  areas  in  which  lesion-caused  language  deficiencies 
have  not  been  noted  yet,  areas  which  may  later  prove  to  be  crucial 
to  language  research. 

Although  the  existence  of  areas  outside  the  Wernicke 
model  clouds  the  picture  for  explaining  language,  it  points  to  a 
wide  range  of  opportunities  for  further  research  in  the 
brain/language  field.  The  possibilities  of  identifying  brain  regions 
active  in  language  and  yet  traditionally  associated  with  other 
functions,  such  as  the  amygdala  which  functions  in  emotion 
(Schumann,  1990),  hold  out  the  chance  that  identification  could 
lead  to  a  greater  understanding  of  the  nature  of  second  language 
learning  and  production,  and  even  language  itself. 


NOTES 


1  In  electrical  stimulation  studies,  electrodes  are  implanted  in  the  brain  over 
the  area  to  be  studied.  Application  of  an  electrical  current  there  disrupts  the  normal 
electrical  activity  of  the  neurons  and  thus  inhibits  their  functioning.  Behavioral 
correlations  to  these  structures  can  be  made  during  stimulations,  which,  in  effect, 
produce  a  "temporary"  brain  lesion. 

2  The  Wechsler  Adult  Intelligence  test  measures  cognitive  functioning. 

3  The  Wada  test  involves  injecting  sodium  amytal  into  the  blood  supply  of 
either  the  right  or  the  left  hemisphere  of  the  brain.  Because  sodium  amytal  is  a 
barbiturate  that  slows  normal  brain  functions,  the  test  reveals  which  side  of  the  brain  is 
language  dominant  by  identifying  the  hemisphere  in  which  language  functions  are 
hindered  upon  injection. 


A  Neurobiological  Model  of  Procedural 
Linguistic  Skill  Acquisition 

This  paper  presents  a  neurobiologically  inspired  model  of  one  aspect  of 
adult  second  language  acquisition  (SLA):  procedural  linguistic  skill  acquisition. 
Procedural  linguistic  skills  are  defined  as  the  speaker/learner's  implicit, 
unstatable  knowledge  regarding  the  formal  linguistic  (i.e.,  syntactic, 
phonological,  and  morphological)  properties  of  the  second  language  (L2). 
Unlike  declarative  linguistic  knowledge  (i.e.,  semantic  and  lexical  knowledge  and 
explicit  knowledge  of  the  L2  linguistic  system),  which  can  be  readily  displayed 
through  verbal  report  or  description,  procedural  linguistic  skills  are  best 
demonstrated  through  performance.  The  proposed  acquisition  model  crucially 
involves  the  neural  circuitry  of  the  neocerebellum.  The  neocerebellum  is  a  brain 
structure  which,  although  traditionally  associated  with  purely  motor  activity,  has 
recently  been  implicated  in  higher  cognitive  and,  potentially,  linguistic 
functions.  The  model  provides  for  a  potential  unification  of  the  competing 
cerebral  (Ojemann,  1991;  Loritz,  1991)  and  cerebellar  (Rumelhart  &  McClelland, 
1986;  Sokolik,  1990)  theories  of  linguistic  function  by  integrating  the  unique 
contributions  of  both  regions  of  the  cerebral  cortex  (e.g.,  Broca's  expressive 
speech  area  and  the  prefrontal  cortex  responsible  for  cognitive  planning  and 
monitoring  functions)  and  regions  of  the  cerebellum  (an  enormous  capacity 
parallel  processor  responsible  for  the  integration  of  cognitive  and  sensory 
information).  The  proposed  model  also  offers  a  principled  account  of  how 
explicit  formalized  grammar  instruction  might  potentially  serve  as  an  effective 
metacognitive  strategy  for  the  L2  learner's  acquisition  of  procedural  linguistic 
skills. 


INTRODUCTION 


This  paper  presents  a  neurobiological  model  of  one  aspect  of 
adult  second  language  acquisition  (SLA):  procedural  linguistic  skill 
acquisition.1  Procedural  linguistic  skills  are  defined  for  present 
purposes  as  the  speaker/learner's  implicit,  unstatable  knowledge  of 
the  structure  and  form  of  the  second  language  (L2),  including 
knowledge  of  the  so-called  "abstract  rules"  of  the  syntax, 
phonology,  and  morphology  of  the  L2.  Procedural  linguistic  skills 
concern  those  aspects  of  the  L2  linguistic  system  which 
speaker/learners  "know"  only  in  the  sense  that  they  are  able  to 
produce  grammatical  strings  in  the  L2  which  both  reflect  and  obey 
these  underlying  rules,  principles  and  constraints;  "naive" 
speaker/learners  are  largely  unable  to  describe  this  knowledge  in 
significant  detail  or  with  much  accuracy.  I  refer  to  this  type  of 
"knowledge"  as  a  skill  precisely  because  it  is  best  demonstrated 
through  performance,  rather  than  through  verbal  or  written  report. 
Not  all  knowledge  about  language,  however,  is  implicit  and 
unstatable;  speaker/learners  also  have  a  significant  amount  of  explicit 
knowledge  about  the  L2.  This  explicit  and  statable  knowledge  is 
referred  to  as  declarative  linguistic  knowledge;  examples  include 
lexical  and  semantic  knowledge,  and  explicit  formally  learned 
knowledge  of  the  syntactic,  phonological,  and  morphological 
properties  of  the  L2.  As  an  illustrative  example  consider  the 
phonological  and  morphological  knowledge  that  speaker/learners  of 
English  have  about  the  noun  "house."  They  possess  explicit 
declarative  knowledge  that  the  plural  form  of  the  noun  is  "houses" 
(/hauziz/),  but  they  also  have  implicit  procedural  linguistic 
knowledge  of  the  Obligatory  Contour  Principle  (a  constraint  which 
essentially  forbids  adjacent  identical  elements  or  features  within  a 
phonological  constituent)  which,  in  this  case,  forces  the  epenthesis 
(insertion)  of  the  default  vowel  /I/  to  separate  the  two  adjacent 
identical  consonants.2 

Essentially,  the  proposed  model  assumes  that  the  acquisition 
of  procedural  linguistic  skills  in  an  L2  involves  the  gradual,  stage- 
wise  formulation  and  refinement  of  detailed  execution  programs 
within  the  neural  circuitry  of  the  neocerebellum  and  related 
structures.  Adult  procedural  linguistic  skill  acquisition  is 
represented  within  this  model  as  the  operationalization  of  "abstract" 
or  conceptual  linguistic  plans  originating  in  Broca's  expressive 
speech  region  under  the  monitoring  and  strategic  planning  influence 
of  the  frontal  cortex.  Through  its  integration  of  diverse  brain 
structures,  including  regions  of  both  the  cerebral  hemispheres  and  the 
cerebellum,  the  model  presented  here  offers  a  potential  unification  of 
the  currently  competing  cerebral  (Ojemann,  1991;  Loritz,  1991)  and 
cerebellar  (Rumelhart  &  McClelland,  1986;  Sokolik,  1990)  models 
of  linguistic  function. 

I  would  like  to  begin  this  paper  by  making  explicit  a  number 
of  assumptions  and  theoretical  preferences  which  underlie  the 
proposed  model  of  adult  L2  procedural  linguistic  skill  acquisition. 
Then,  I  will  briefly  describe  the  neurobiological  processes  involved 
in  the  acquisition  and  storage  of  knowledge  (i.e.,  learning  and 
memory)  and  offer  a  general  sketch  of  a  larger  inclusive 
neurobiological  model  of  SLA  into  which  the  present  model  of 
procedural  linguistic  skill  acquisition  might  fit.  Finally,  I  will 
present  the  model  and  discuss  the  potential  contributions  of  this 
avenue  of  research  to  the  overall  understanding  of  the  processes 
involved  in  adult  SLA. 

Underlying  Assumptions  of  Proposed  Model 

First,  the  proposed  model  is  neurobiologically  inspired;  it 
relies  crucially  on  Squire's  neurobiological  theory  of  memory 
(Squire,  1982,  1985,  1986,  1987;  Squire  &  Zola-Morgan,  1991) 
and  is  based  upon  neurobiological  models  of  voluntary  motor 
activity  (Ghez,  1991)  and  procedural  motor  skill  learning 
(McCormick  &  Thompson,  1984;  Thompson,  1986,  1989; 
Harrington,  Haaland,  Yeo  &  Marder,  1990;  Bloedel,  Bracha,  Kelly 
&  Wu,  1991;  Greenough  &  Anderson,  1991).  As  Jacobs  & 
Schumann  (1992)  argue,  it  is  important  that  any  model  or  theory 
which  purports  to  account  for  language  acquisition  (either  primary 
or  second)  be  at  least  neurobiologically  plausible.  If  we  are  ever 
ultimately  to  understand  how  human  language  is  acquired,  we  must 
begin  to  consider  how  the  human  brain,  given  what  we  know  of  its 
anatomical  structure  and  its  physiological  function,  might  acquire 
language. 

Secondly,  much  of  the  neuroscientific  research  upon  which 
the  present  model  is  based  is  concerned  with  non-linguistic  learning 
and  memory  (i.e.,  the  acquisition  and  storage  of  knowledge)  in  both 
non-human  and  human  subjects.  I  maintain,  however,  that  it  is 
valid  to  build  a  model  of  linguistic  skill  acquisition  upon  this 
research  for  the  following  reasons.  First,  a  number  of  researchers 
have  argued  that  adult  SLA  is  in  many  ways  similar  to  the 
acquisition  of  other  complex  cognitive  skills  and  is,  to  a  significant 
extent,  dependent  upon  "general"  cognitive  learning  processes 
which  are  not  specific  to  language  (Bialystok  &  Ryan,  1985;  Faerch 
&  Kasper,  1985;  McLaughlin,  1987;  O'Malley  &  Chamot,  1990). 
Second,  although  much  current  neuroscientific  research  is 
performed  on  non-human  subjects,  the  findings  are  to  a  surprising 
degree  generalizable  to  human  subjects  and  the  data  available  from 
cross-species  comparisons  support  the  notion  that  the  "fundamental 
neurobiological  structure  and  principles  remain  the  same  across 
mammalian  species"  (Jacobs  &  Schumann,  1992:  285,  emphasis 
theirs).  I  want  to  emphasize,  however,  that  although  I  agree  with 
Klein's  assertion  that  "the  capacity  to  acquire  and  use  a  language  is  a 
species-specific  genetic  endowment"  (Klein,  1990:  219),  the 
present  model  makes  no  assumptions  regarding  the  issue  of  the 
innateness  of  linguistic  ability  in  humans  and  is  entirely  consistent 
with  both  the  environmentalist  (cf.  Jacobs,  1988;  Greenfield,  1991) 
and  the  nativist-constructivist  (cf.  Crain,  1991;  Karmiloff-Smith  & 
Johnson,  1991)  views  of  language  acquisition. 

Third,  the  present  model  of  procedural  linguistic  skill 
acquisition  presupposes  a  larger  model/theory  of  SLA  in  which  the 
acquisition  of  competence  in  an  L2  is  assumed  to  involve  the 
acquisition  of  at  least  the  following  four  distinct  components:  a 
motor  skill  component  responsible  for  phonetic  speech  output;  a 
general  cognitive  component  concerned  with  cognitive  skills  related 
to  the  use  of  language  which  are  not  specifically  linguistic,  such  as 
reasoning,  development  of  plans  for  behavior,  and  the  strategic  use 
of  available  resources  and  capacities  to  achieve  a  goal;  a  declarative 
linguistic  skill  component  which  consists  of  the  speaker/learner's 
explicit  knowledge  of  the  linguistic  system;  and  a  procedural 
linguistic  skill  component  which  comprises  the  speaker/learner's 
implicit  knowledge  of  the  structure  and  form  of  the  language. 

Fourth,  although  this  model  presumes  the  existence  of  a 
localized  and  distinct  neural  system  devoted  to  language  function, 
the  proposed  system  is  less  modular  and  restrictive  than  traditional 
neurolinguistic  models  such  as  those  presented  by  Geschwind 
(1970)  and  Ojemann  (1987)  which  concern  themselves  primarily 
with  strictly  defined  regions  of  the  left  cerebral  hemisphere  (i.e., 
Broca's  and  Wernicke's  areas).  The  present  model  postulates  a 
neural  system  which  is  self-contained  yet  distributed  within  a 
circular  loop  across  several  distinct  brain  regions  including  both  the 
cerebral  and  the  cerebellar  hemispheres  (for  discussion  of  the  role  of 
additional  brain  structures  in  language  function  see  Lem,  this 
volume  and  Sato  &  Jacobs,  this  volume).  The  present  model  is  not 
only  a  more  plausible  representation  of  the  functional  organization  of 
the  brain  than  the  strictly  modular  traditional  neurolinguistic  models3 
but  also  offers  a  possible  compromise  between  two  currently 
competing  theories  of  linguistic  representation:  the  symbolist,  or 
cerebral  theories  proposed  by  researchers  such  as  Loritz  (1991)  and 
Ojemann  (1987)  and  the  connectionist,  or  cerebellar  theories  such  as 
those  of  Rumelhart  and  McClelland  (1986)  and  Sokolik  (1990).4 

Neurobiology  of  Learning  and  Memory 

Memory  is  assumed  to  consist  of  information,  or 
knowledge,  which  is  stored  and  retrieved  through  the  patterns  of 
synapses  (i.e.,  communicative  connections  between  neurons) 
existing  within  a  given  neuronal  network  (Squire,  1987; 
Kupfermann,  1991;  Thompson,  1987).  Knowledge  is  stored,  or 
acquired,  through  local  changes  occurring  within  a  particular  neural 
network.  Local  changes,  which  constitute  the  neural  mechanism  for 
learning,  may  involve  either  morphological  or  chemical  alterations. 
Morphological  alterations  include  the  formation  of  new  synapses 
and  the  structural  modification  of  preexisting  synapses.  Chemical 
changes  involve  the  alteration  of  the  membrane  properties  of 
neurons,  which  in  turn  may  influence  the  functional  properties  of 
potential  or  preexisting  synapses. 

Given  that  learning  involves  the  formation  of  new  synapses 
and/or  the  morphological  or  chemical  modification  of  synapses,  the 
acquisition  of  novel  information  crucially  depends  upon  the 
existence  of  "plasticity"  within  the  relevant  neuronal  circuitry. 
Plasticity  is  defined  as  the  capacity  of  a  given  neuronal  network  to 
create  new  synaptic  connections  or  modify  preexisting  ones  in 
response  to  novel  input  from  the  environment,  either  external  or 
internal  (i.e.,  the  capacity  to  learn).  Plasticity  has  been  documented 
within  numerous  neural  systems  including  those  relevant  to  the 
present  discussion:  the  cerebral  cortex  and  the  cerebellum  (Purves 
&  Lichtman,  1980);  the  hippocampus  and  related  cortical  areas 
(Squire  &  Zola-Morgan,  1991);  and  the  red  nucleus,  a  brain  stem 
structure  receiving  massive  projections  from  the  neocortex  (the  most 
recently  evolved  portion  of  the  cerebral  cortex)  (Tsukahara,  1984). 
The  importance  of  plasticity  within  each  of  these  functional  neural 
systems  will  be  discussed  in  greater  detail  in  later  sections. 


OUTLINE  OF  LARGER  INCLUSIVE  MODEL  OF 
ADULT  SLA 

Motor  Speech  Component 

One  aspect  of  SLA  involves  the  acquisition  and  fine-tuning 
of  the  purely  motor  skills  required  for  fluent  and  accurate  phonetic 
speech  production.  The  realization  of  the  L2  linguistic  system  as 
phonetic  speech  output  requires  the  formulation  and  automaticization 
of  highly  detailed  motor  programs  that  encode  the  precisely  timed 
and  coordinated  neuronal  impulses  which  ultimately  result  in  a 
complex  set  of  muscle  movements.  Although  the  motor  neurons 
which  directly  innervate  the  muscles  of  the  speech  organs  are  located 
in  Brodmann's  areas  4  &  6  of  the  cerebral  cortex,5  research 
suggests  that  the  cerebellum  plays  an  integral  role  in  the  acquisition 
and  orchestration  of  motor  speech  activity  (Figure  1). 

Although  research  in  this  area  remains  speculative,  it  is 
generally  accepted  that  the  basal  ganglia,  the  cerebellum,  and  related 
neural  circuitry  are  responsible  for  the  acquisition  and  storage  of 
certain  types  of  procedural  motor  skills  (Harrington,  Haaland,  Yeo 
&  Marder,  1990;  Bloedel,  Bracha,  Kelly  &  Wu,  1991;  Ghez, 
1991).  The  structures  of  the  basal  ganglia  are  reportedly  involved  in 
the  facilitation  and  inhibition  of  movement,  as  well  as  the  regulation 
of  movement  speed  (Ghez,  1991);  the  cerebellum  is  reportedly 
responsible  for  the  acquisition  and  storage  of  the  detailed  motor 
activity  programs  which  underlie  a  restricted  subset  of  procedural 
motor  skills:  those  which  crucially  require  the  neural  circuitry  of  the 
cerebellum  for  their  execution  (McCormick  &  Thompson,  1984; 
Thompson,  1986,  1989;  Bloedel,  Bracha,  Kelly  &  Wu,  1991; 
Greenough  &  Anderson,  1991).  Research  suggests  that  the 
cerebellum  is  indeed  crucially  involved  in  phonetic  speech 
production  (Ivry  &  Keele,  1989;  Raichle,  1990)  and  may  therefore 
be  responsible  for  the  acquisition  and  storage  of  the  procedural 
motor  speech  programs  which  are  responsible  for  modulating  and 
orchestrating  motor  speech  activities.  In  fact,  it  has  been  suggested 
that  the  evolutionary  development  of  phonetic  speech  ability  (an 
ability  which  depends  upon  high-speed  processing  and  the 
integration  of  mental  and  motor  activity)  in  humans  was  largely  the 
result  of  the  phylogenetic  enlargement  of  regions  of  the 
neocerebellum  (Leiner,  Leiner  &  Dow,  1987:  429). 

Figure  1.  Sketch  of  lateral  view  of  human  brain  identifying  numerous 
functional  and  anatomical  regions.  A=Broca's  Area  44  &  45, 
B=Frontal  Lobe,  C=Area  8,  Prefrontal  Cortex,  D=Area  6, 
Supplementary  Motor  Cortex,  E=Area  4,  Primary  Motor  Cortex, 
F=Primary  Sensory  Cortex,  G=Secondary  Sensory  Cortex, 
H=Parietal  Lobe,  I=Occipital  Lobe,  J=Cerebellum,  K=Brain  Stem, 
L=Wernicke's  Area  22  &  42,  M=Temporal  Lobe. 


General   (Non-Linguistic)  Cognitive  Component 

Certain  aspects  of  the  acquisition  and  production  of  a  second 
language  are  assumed  to  be  extra-linguistic,  involving  general 
cognitive  capacities  such  as  reasoning,  the  development  of  plans  for 
future  actions,  and  the  strategic  direction  and  integration  of  available 
resources  towards  a  specified  goal.  The  prefrontal  region  of  the 
cerebral  cortex  (cf.  Figure  1)  is  generally  acknowledged  to  be 
responsible  for  the  cognitive  functions  of  abstract  reasoning, 
weighing  the  consequences  of  future  actions,  and  planning 
accordingly  (Fuster,  1988,  1992;  Kupfermann,  1991).  It  is  likely 
that  the  prefrontal  cortex  is  also  responsible  for  the  acquisition  of 
these  same  skills  as  they  relate  to  the  use  of  a  second  language. 
Research  supports  the  notion  that  the  prefrontal  cortex  is  responsible 
for  the  reasoning  and  planning  activities  required  for  the  utilization 
of  language  to  conceptualize,  elaborate,  and  express  our  thoughts 
(Novoa  &  Ardila,  1987).  After  reviewing  the  substantial  clinical 
and  experimental  research,  Stuss  and  Benson  (1984)  conclude  that 
frontal  lobe  lesion  studies  support  a  role  for  the  frontal  lobe  in 
organization  and  sequential  planning,  monitoring  of  behavior, 
directed  attention,  and  error  detection.  Numerous  studies  document 
the  following  impairments  in  patients  following  frontal  lesions:  (1) 
an  inability  to  use  verbalized  (i.e.,  declarative)  knowledge  to  guide 
motor  activity,  (2)  an  impaired  ability  to  organize  sequential 
behaviors,  (3)  an  impaired  capacity  to  direct  and  maintain  attention, 
and  (4)  an  impaired  ability  to  monitor  on-going  activity  (Stuss  & 
Benson,  1984:  22-23).  Patients  with  prefrontal  damage  are  typically 
observed  to  display  generally  intact  formal  linguistic  systems  and  yet 
are  significantly  impaired  in  their  ability  to  use  their  linguistic 
resources  strategically  to  accomplish  desired  linguistic  behaviors  or 
achieve  communicative  goals  (Novoa  &  Ardila,  1987).  These 
findings  have  led  researchers  to  conclude  that  the  linguistic 
impairments  observed  in  patients  with  prefrontal  damage  are  not  due 
to  deficits  in  specifically  linguistic  functions,  but  are  instead  the 
result  of  their  generally  impaired  ability  to  exercise  control  over 
behavior,  focus  voluntary  attention  appropriately,  and  develop  plans 
which  direct  their  activities  towards  a  specified  goal  (Novoa  & 
Ardila,  1987:  207). 

Specifically   Linguistic  Components 

The  current  model  fundamentally  assumes  a  distinction 
between  two  types  of  linguistic  knowledge  and  proposes  two 
separate  components  devoted  to  specifically  linguistic  knowledge: 
one  involving  declarative  linguistic  knowledge  and  the  other 
procedural  linguistic  skills.  This  proposed  distinction  between 
declarative  and  procedural  knowledge  originated  within  the  fields  of 
artificial  intelligence  (Winograd,  1975)  and  cognitive  psychology 
(Anderson,  1976)  and  was  recently  applied  to  specifically  linguistic 
knowledge  by  a  number  of  researchers  interested  in  SLA 
(Anderson,  1980,  1985;  Bialystok  &  Ryan,  1985;  Faerch  & 
Kasper,  1985;  O'Malley  &  Chamot,  1990).  For  instance,  Faerch 
and  Kasper  (1985)  classified  semantic  knowledge  of  word  meaning 
and  explicitly  "learned"  rules  of  grammar  as  declarative,  and 
strategies  and  procedures  used  to  implement  declarative  knowledge 
as  procedural  knowledge;  O'Malley  and  Chamot  (1990)  described 
declarative  linguistic  knowledge  as  knowledge  "about"  how  to  use 
language  and  procedural  knowledge  as  the  skills  required  to  actually 
use  language  as  a  communicative  tool.  Unfortunately,  despite 
obvious  implications  for  research  and  theory,  SLA  researchers  have 
generally  been  hesitant  to  pursue  this  distinction,  perhaps  because  of 
the  imprecise  and  confounding  way  in  which  the  terms  "declarative" 
and  "procedural"  are  often  used.  Among  those  who  have  addressed 
the  topic  there  has  been  significant  debate  concerning  the  extent  to 
which  these  two  types  of  knowledge  differ  in  the  nature  of  their 
representation  in  memory,  the  degree  to  which  one  type  of 
knowledge  can  be  transformed  into  the  other  type,  and  even  the 
feasibility  of  accurately  classifying  knowledge  as  being  either 
declarative  or  procedural  (Anderson,  1980,  1985;  Bialystok  & 
Ryan,  1985;  Faerch  &  Kasper,  1985;  O'Malley  &  Chamot,  1990). 
It  is  interesting  to  note  that  the  question  of  whether  and  how 
declarative  knowledge  might  be  transformed  into  procedural 
knowledge  is  reminiscent  of  the  longstanding  debate  in  SLA 
research  concerning  the  possible  facilitatory  role  of  "learned" 
linguistic  knowledge  in  the  subsequent  "acquisition"  of  that 
knowledge  (Lamendella,  1979;  Krashen,  1981;  Gregg,  1984; 
McLaughlin,  1987). 

The  declarative/procedural  distinction  has  also  been  adopted 
by  neurobiological  researchers  and  incorporated  within  their  theories 
of  learning  and  memory  (Cohen  &  Squire,  1980,  1981;  Squire, 
1982,  1985,  1986,  1987;  Tulving,  1987;  Kupfermann,  1991; 
Squire  &  Zola-Morgan,  1991).  Researchers  working  within  this 
neurobiological  paradigm  have  been  able  to  formulate  more  precise 
and  theoretically  constrained  definitions  of  each  type  of  knowledge, 
offer  substantial  clinical  and  experimental  evidence  supporting  the 
validity  of  the  proposed  declarative/procedural  distinction,  and 
clarify  the  possible  facilitative  role  of  declarative  linguistic 
knowledge  in  the  acquisition  of  procedural  linguistic  knowledge. 
Squire  and  Zola-Morgan  (1991),  for  example,  have  developed  a 
taxonomy  of  knowledge  types  which  distinguishes  between 
declarative  (factual  and  episodic)  knowledge  and  non-declarative 
knowledge.  Non-declarative  knowledge  comprises  several  distinct 
sub-types  of  knowledge  including:  procedural  skill  knowledge, 
priming,  simple  classical  conditioning,  and  non-associative 
knowledge.  Only  two  of  these  types  of  knowledge  are  relevant 
to  the  present  discussion:  declarative  knowledge  and  procedural 
skill  knowledge. 

For  present  purposes  I  will  assume  that  declarative  linguistic 
knowledge  refers  to  the  speaker/learner's  lexical  and  semantic 
knowledge  and  explicit,  formally  "learned"  knowledge  of  the  rules 
of  the  L2  grammar  (e.g.,  memorized  and  statable  knowledge  of 
grammatical  rules),  while  procedural  linguistic  skills  consist  of  the 
speaker/learner's  implicit  knowledge  of  the  "abstract  rules"  related  to 
the  sequencing,  coordination,  and  combination  of  linguistically 
relevant  units  (phonemes,  morphemes,  words,  phrases,  etc.)  into 
grammatical  configurations  as  required  for  the  actual  use  of  the  L2  in 
real-time  as  a  communicative  tool.  Thus,  a  defining  characteristic 
which  can  help  to  identify  knowledge  as  being  either  declarative  or 
procedural  in  nature  is  the  means  by  which  it  can  be  demonstrated: 
declarative  knowledge  can  be  explicitly  verbalized  and  procedural 
skills  can  be  performed.  However,  this  distinction  does  not 
preclude  the  possibility  of  learners  acquiring  declarative  knowledge 
related  to  the  performance  of  an  essentially  procedural  skill,  perhaps 
even  without  adequately  acquiring  the  procedural  aspects  required 
for  the  performance  of  the  skill.  This  may,  in  fact,  be  precisely 
what  is  happening  to  learners  who  are  able  to  demonstrate  accurate 
grammatical  knowledge  of  the  L2  "declaratively,"  yet  are  unable  to 
use  this  knowledge  "procedurally." 

As  an  additional  example  of  each  type  of  linguistic 
knowledge  consider  what  speaker/learners  of  English  know  about 
the  word  "give."  Declarative  linguistic  knowledge  of  "give" 
includes  the  fact  that  "give"  symbolically  encodes  the  following 
concept:  the  transfer  of  possession  or  ownership  of  some  object  or 
entity  from  one  party  to  another,  as  a  result  of  some  action  of  the 
first  party.  Procedural  linguistic  skills  related  to  "give"  include  the 
speaker/learner's  implicit,  encoded  knowledge  that  "give"  must 
appear  in  syntactic  constructions  as  the  head  of  a  verb  phrase 
containing  two  arguments  (a  direct  and  an  indirect  object),  assigns 
inherent  case,  and  thereby  licenses  dative  alternation  of  its  direct  and 
indirect  objects.6  It  is  worth  emphasizing  that  speaker/learners  may 
possess  significant  amounts  of  procedural  knowledge  related  to  the 
syntactic,  phonological,  and  morphological  properties  of  a  word  and 
yet  be  largely  or  entirely  unable  to  express  this  knowledge  verbally, 
as  in  the  case  of  untutored,  naturalistic  L2  learners  who  have  little  if 
any  declarative  knowledge  of  the  L2  system  beyond  their  semantic 
and  lexical  knowledge. 

The  validity  of  the  distinction  between  declarative  and 
procedural  knowledge  types  rests  primarily  on  the  extensive  clinical 
and  experimental  research  performed  with  human  amnesics  and 
lesion  studies  performed  upon  laboratory  animals.  Patients  with 
lesions  localized  to  the  medial  temporal  lobe  of  the  cerebral  cortex, 
resulting  either  from  surgery  or  injury,  have  been  reported  to 
demonstrate  a  significant  loss  of  prior  declarative  memory 
(retrograde  amnesia)  in  addition  to  a  severely  impaired  capacity  to 
acquire  declarative  knowledge  (anterograde  amnesia)  (Milner,  1966; 
Warrington  &  Weiskrantz,  1982;  Squire,  1986;  Squire  &  Zola- 
Morgan,  1991).  Although  the  extent  to  which  prior  memories  are 
lost  varies  considerably,  most  patients  retain  a  significant  portion  of 
their  remote  memory  (i.e.,  memory  stored  many  years  prior  to 
damage).  The  capacity  of  these  patients  to  acquire  and  retrieve 
procedural  knowledge,  however,  remains  remarkably  intact 
(Warrington  &  Weiskrantz,  1982;  Squire,  1986;  Tulving,  1987). 
Despite  an  inability  to  remember  even  the  simplest  facts,  amnesics 
demonstrate  a  normal  ability  to  acquire  novel,  complex  procedural 
skills  such  as  reverse  mirror  reading  (Cohen  &  Squire,  1980). 
Thus,  the  defining  characteristics  of  human  amnesia,  a  significant 
impairment  of  declarative  memory  in  conjunction  with  spared 
procedural  memory,  support  the  existence  of  separate 
memory/knowledge  systems  that  are  dependent  upon  distinct 
neuroanatomical  structures  for  their  acquisition  and  storage  (Squire, 
1986;  Tulving,  1987;  Kupfermann,  1991;  Squire  &  Zola-Morgan, 
1991). 

In  the  sections  below,  I  will  briefly  present  what  are 
currently  considered  the  most  plausible  neuroanatomical  substrates 
for  each  type  of  linguistic  knowledge.  However,  I  would  like  to 
point  out  that  the  fundamentally  distinct  character  of  these  two  types 
of  knowledge,  declarative  being  a  chunk  of  information  and 
procedural  a  detailed  program  for  activity,  as  well  as  their 
dependence  upon  distinct  neuroanatomical  systems,  make  it  entirely 
inconceivable  that  knowledge  of  one  type  could  ever  be 
"transformed"  into  knowledge  of  the  other  type.  However,  this 
does  not  preclude  the  possibility  that  previously  acquired  knowledge 
of  one  type  may  facilitate  the  subsequent  acquisition  of  related 
knowledge  of  the  other  type,  which  is  essentially  what  I  will 
propose  in  a  later  section  of  this  paper. 

Anatomical  substrate  for  declarative  linguistic 
knowledge/memory 

Recent research on human amnesics and non-human
primates  provides  compelling  evidence  that  the  medial  temporal  lobe 
system (consisting of the hippocampus, the parahippocampal gyrus,
and  the  entorhinal  and  perirhinal  cortices)  is  primarily  responsible 
for  and  crucially  involved  in  the  acquisition  and  storage  of 
declarative  knowledge,  although  the  actual  site  of  long-term  memory 
storage  most  likely  lies  outside  of  this  region  (Squire  &  Zola- 
Morgan,  1991;  Plummer,  1991). 

The  acquisition  of  declarative  knowledge  (including 
specifically  linguistic  declarative  knowledge)  involves  the  shift  of 
memory  stores  from  short-term,  working  memory  to  long-term 
memory  and  the  subsequent  consolidation  with  previously  acquired 
knowledge  and  transfer  to  a  location  independent  of  the  medial 
temporal  lobe  system.  The  neocortex  is  presumed  to  play  a 
significant,  as  yet  undefined,  role  in  the  transfer  of  declarative 
memory  from  semi-permanent  storage  sites  within  the  medial 
temporal  lobe  system  to  long-term  storage  sites,  which  can  then  be 
accessed  and  retrieved  independently  (Squire  &  Zola-Morgan, 
1991).  The  remote  declarative  memory  spared  in  medial  temporal 
lobe  amnesia  is  presumed  to  be  that  which  has  been  transferred  to 
this  independent  long-term  storage  site  (Squire,  1987). 

Anatomical  substrate  for  procedural  linguistic 
knowledge/memory 

Procedural  knowledge  is  generally  considered  to  involve  an 
aggregate  of  distinct  skills  which  are  acquired  and  stored  in  a 
number  of  distinct  neuroanatomical  systems  (Squire,  1987; 
Harrington,  Haaland,  Yeo  &  Marder,  1990;  Squire,  Zola-Morgan, 
Cave,  Haist,  Musen  &  Suzuki,  1990;  Bloedel,  Bracha,  Kelly  & 
Wu,  1991).  As  discussed  previously,  the  cerebellum  and  the  basal 
ganglia  may  be  responsible  for  the  acquisition  and  orchestration  of 
those  procedural  motor  skills  responsible  for  phonetic  speech 
activity.  Research  involving  the  acquisition  of  the  procedural  skills 
underlying  complex  cognitive  and  linguistic  activities  has 
traditionally  focused  upon  the  cerebral  cortex,  but  researchers  have 
recently  argued  that  the  neocerebellum  may  also  participate  in  the 
modulation,  integration,  and  acquisition  of  these  skills  (Leiner, 
Leiner & Dow, 1986, 1987, 1989, 1991; Schmahmann, 1991).^

The  neocerebellum  forms  a  significant  portion  of  what 
researchers  have  identified  as  an  extensive  and  phylogenetically 
enlarged  "learning  loop"  (Leiner,  Leiner  &  Dow,  1987).  This 
proposed  "learning  loop"  is  essentially  a  circular  neural  circuitry 
system  which  connects  the  newly  evolved  regions  of  a  number  of 
brain  structures  including  the  cerebral  neocortex,  the  neocerebellar 
nuclei and cortex, and the red nucleus. The system is assumed to
contribute  substantially  to  the  rapid  and  fluent  acquisition  and 
performance  of  procedural  motor,  cognitive,  and  linguistic  skills  in 
humans  (Leiner,  Leiner  &  Dow,  1986,  1987,  1989,  1991).  The 
current  proposal,  which  asserts  that  the  neocerebellum  is  largely 
responsible  for  the  acquisition  of  procedural  linguistic  skills,  relies 
upon  exactly  these  extensive  connections  with  newly  evolved 
regions  of  the  human  brain  and  areas  of  the  cerebral  cortex 
traditionally  associated  with  language  function,  in  addition  to  the 
cerebellum's  enormous  computational  capacity.  The  proposed 
involvement  of  the  neocerebellum  and  related  neural  circuitry  in  the 
acquisition  of  procedural  linguistic  skills  will  be  discussed  in  detail 
in  the  section  which  follows. 


PROPOSED  MODEL  OF  PROCEDURAL 
LINGUISTIC  SKILL  ACQUISITION 


The  model  of  procedural  linguistic  skill  learning  presented 
here  is  based  upon  proposals  made  by  Thompson  (1984,  1986, 
1989),  Bloedel,  Bracha,  Kelly  &  Wu  (1991),  Ghez  (1991),  and 
Greenough  &  Anderson  (1991)  concerning  cerebellar  involvement 
in  voluntary  movement  and  motor  skill  learning.  Current  models  of 
voluntary  movement  propose  that  the  cerebellum  is  responsible  for 
integrating  the  sensory  input  from  the  environment  with  the 
conceptual  motor  activity  plans  of  the  prefrontal  association  cortex 
and  ultimately  producing  detailed,  precisely  timed  and  coordinated 
programs  for  the  execution  of  motor  activity  which  are  then  relayed 
to  the  relevant  musculature  by  way  of  the  motor  neurons  of  the 
cerebral  cortex  (Ghez,  1991).  Essentially,  the  model  below 
proposes  a  similar  involvement  for  the  neocerebellum  in  the 
formulation  and  acquisition  of  (non-motor)  linguistic  programs. 
Although  the  cerebellum  has  not  traditionally  been  assumed  to  play  a 
significant  role  in  the  acquisition  or  production  of  language  (aside 
from its strictly motor involvement in phonetic speech production), I
suggest  that  it  is  in  fact  uniquely  suited  for  its  proposed  role  in  the 
acquisition  of  procedural  linguistic  skills  for  several  reasons.  First, 
the  cerebellar  cortex  contains  an  enormous  number  of  neurons, 
similar  to  that  of  the  cerebral  cortex,  enabling  it  to  perform  large 
quantities  of  precise  computations  quickly  and  accurately  (Leiner, 
Leiner & Dow, 1987: 434). Second, because of the highly
structured, parallel organization of its dendritic and axonal networks
which  are  easily  and  continuously  modified  by  experience,  the 
cerebellar  cortex  is  capable  of  the  immense  quantities  of  high-speed 
parallel  processing  that  are  required  for  the  acquisition  and 
production  of  language.^  Finally,  because  of  its  ability  to  integrate 
incoming  ascending  sensory  and  descending  cortical  information 
with  ongoing  motor  activities  and  its  extensive  connections  with 
areas  of  the  cerebral  cortex  implicated  in  language  function  (e.g. 
frontal  and  parietal  association  areas,  prefrontal  cortex,  Broca's 
area,  Wernicke's  area,  and  motor  speech  cortex)  the  neocerebellum 
appears  ideally  suited  to  the  task  of  orchestrating  ongoing  linguistic 
activity  while  acquiring  or  fine  tuning  novel  procedural  linguistic 
skills (Leiner, Leiner & Dow, 1986, 1987, 1989, 1991;
Schmahmann,  1991). 

Because  the  organization  of  information  flowing  into  and  out 
of  the  cerebellum  is  of  central  importance  in  understanding  precisely 
how  it  might  initially  acquire  and  subsequently  improve  and  refine 
performance  of  novel  linguistic  skills,  I  briefly  summarize  the 
relevant  aspects  of  the  neuroanatomical  (structural)  and 
neurophysiological  (functional)  organization  of  the  neocerebellum 
below. 

Anatomy  and  Physiology  of  the  Neocerebellum 

The  cerebellum  is  located  posterior  to  the  pons  and  medulla 
and inferior to the cerebral hemispheres (cf. Figure 1). As a whole
the  cerebellum  is  responsible  for  the  maintenance  of  equilibrium  and 
balance,  posture  and  muscle  tone,  and  the  initiation,  coordination, 
and  modulation  of  motor  activities;  responsibility  for  initiation  and 
temporal  coordination  is  presumably  shared  with  the  structures  of 
the  basal  ganglia.  The  cerebellum  is  organized  such  that  information 
from  distinct  functional  systems  (e.g.,  those  devoted  to  equilibrium, 
posture,  coordination)  is  directed  towards  different  cerebellar  nuclei 
and  different  regions  of  the  cerebellar  cortex  for  processing. 
Processed  information  and  stimuli  are  then  conveyed  to  related 
functional  anatomical  systems  in  other  areas  of  the  nervous  system. 
In  this  manner  functional  systems  are  localized  within  specific 
regions  of  anatomical  structures  and  yet  distributed  among  networks 
of  connections  which  span  several  anatomical  systems. 

The  following  description  of  the  flow  of  information  into 
and  out  of  the  cerebellum  (Figure  2)  is  true  of  both  the  cerebellum  as 
a  whole  and  the  region  of  primary  concern  to  the  current  discussion: 
the  neocerebellum. 
Figure 2. Highly schematized illustration of the flow of incoming and outgoing
information  in  the  cerebellum.  Plus  signs  (+)  represent  excitatory 
connections,  minus  signs  (-)  inhibitory  connections. 

Information  enters  the  cerebellum  by  way  of  large  bundles  of 
fibers  (cerebellar  peduncles)  which  ultimately  terminate  in 
predetermined  regions  of  the  cortex.  En  route  to  the  cortex  these 
afferent  (incoming)  fibers  send  branching  (secondary)  collateral 
fibers  to  the  relevant  cerebellar  nucleus.  Thus,  the  cerebellar  nuclei 
and  the  cerebellar  cortex  are  essentially  responding  to  the  same 
incoming  stimuli  but  in  a  distinct  manner.  The  efferent  (outgoing) 
fibers  of  the  cerebellar  nuclei  are  excitatory  in  nature  (i.e.,  they  carry 
signals  which  facilitate  activity),  whereas  the  efferent  fibers  of  the 
cerebellar  cortex  are  inhibitory  (i.e.,  they  carry  signals  which  inhibit 
or  suppress  activity).  The  efferent  fibers  from  the  cerebellar  cortex 
all  terminate  in  the  cerebellar  nuclei,  suppressing  the  firing  activities 
of  those  nuclei  and  indirectly  suppressing  motor  activity.  The 
efferent  fibers  from  the  cerebellar  nuclei  are  the  only  fibers  which 
actually  leave  the  cerebellum  and  they  fire  under  the  modulating 
influence  of  the  cerebellar  cortical  neurons.  The  fibers  from  the 
cerebellar  nuclei  ascend  through  the  thalamus  to  terminate  in 
predetermined  regions  of  the  cerebral  motor  cortex  (Brodmann's 
areas  4  &  6)  and  convey  impulses  which  stimulate  the  firing  activity 
of  the  cortical  motor  neurons.  The  motor  neurons  of  the  cerebral 
cortex  are  responsible  for  conveying  the  excitatory  stimuli  to  the 
musculature  which  result  in  motor  activity.  Thus,  although  the 
excitatory  stimuli  which  ultimately  result  in  motor  activity  originate 


250    Robbins 

in  the  cerebellar  nuclei,  these  impulses  can  be  directly  suppressed  by 
the  inhibitory  impulses  of  the  cells  of  the  cerebellar  cortex. 
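
The gating relationship described above can be illustrated with a minimal numerical sketch. It is offered only as a toy, not as a physiological model: the rectified subtraction, the arbitrary activity units, and the names `nuclear_output`, `motor_cortex_drive`, and `thalamic_gain` are all assumptions introduced here for illustration and do not come from the studies cited.

```python
def nuclear_output(afferent_drive, cortical_inhibition):
    """Net excitatory output of a cerebellar nucleus, in arbitrary units.

    Assumption: excitation from afferent collaterals and inhibition from
    the cerebellar cortex simply subtract, and output cannot fall below zero.
    """
    return max(0.0, afferent_drive - cortical_inhibition)


def motor_cortex_drive(afferent_drive, cortical_inhibition, thalamic_gain=1.0):
    """Excitation reaching the cerebral motor cortex by way of the thalamus.

    thalamic_gain is a hypothetical scaling factor for the thalamic relay.
    """
    return thalamic_gain * nuclear_output(afferent_drive, cortical_inhibition)


# Same incoming stimulus, three levels of cortical inhibition.
uninhibited = motor_cortex_drive(1.0, 0.0)
partly_inhibited = motor_cortex_drive(1.0, 0.4)
suppressed = motor_cortex_drive(1.0, 1.5)
```

In this sketch the excitatory signal always originates in the nucleus, yet the cortex alone decides how much of it leaves the cerebellum, which is the sense in which cortical inhibition can suppress motor activity.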

afferents to the neocerebellum: Two important sources of
afferent  (incoming)  fibers  to  the  neocerebellum  are  the  fibers 
descending  from  the  cerebral  cortex  (via  the  corticopontocerebellar 
tract)  and  the  fibers  originating  in  the  olivary  nucleus.  These  tracts 
connect  the  neocerebellum  with  several  cortical  structures  implicated 
in  language  function  including:  the  frontal  cortical  association  areas 
(abstract  reasoning  and  planning),  prefrontal  cortex  (planning  and 
monitoring),  Broca's  expressive  speech  area  (areas  44  &  45),  motor 
speech  cortex  (areas  4  &  6)  and  Wernicke's  receptive  speech  area 
(areas  22  &  42).  The  corticopontocerebellar  tract  consists  of 
efferent  fibers  originating  in  each  of  the  four  lobes  of  the  cerebral 
cortex  which  descend  through  the  pontine  nuclei  and  decussate  (i.e., 
cross)  to  enter  the  contralateral  cerebellar  hemisphere.  In  this 
manner  the  right  cerebellar  hemisphere  receives  information 
regarding  the  activities  of  the  right  side  of  the  body  from  the  left 
cerebral  hemisphere,  and  the  left  cerebellar  hemisphere  from  the 
right  cerebral  cortex.  The  climbing  fibers  from  the  olivary  nucleus 
constitute  a  second  source  of  afferent  fibers  to  the  neocerebellum. 
The  significance  of  these  fibers  is  twofold.  First,  the  olivary 
nucleus  receives  the  majority  of  the  efferent  fibers  from  the  red 
nucleus  which  is  a  brainstem  structure  receiving  massive  projections 
from  the  neocortex.  Second,  the  cells  of  the  red  nucleus  are 
reported  to  exhibit  remarkable  plasticity  (Tsukahara,  1984).  As 
previously  mentioned,  researchers  have  speculated  that  the  red 
nucleus  may  play  a  considerable  role  in  a  newly  evolved  and 
enlarged  learning  loop  within  the  human  brain  (Leiner,  Leiner  & 
Dow, 1986). A third important source of afferent fibers is
the ascending spinal tracts. These fibers convey sensory and
proprioceptive  (related  to  position  and  movement  of  muscles) 
information  from  the  external  environment  and  the  musculature 
directly to the cerebellum. It is this continuously updated
information  concerning  the  changing  environment  and  ongoing 
motor  activity  which  enables  the  cerebellum  to  effectively  monitor 
and  orchestrate  smooth  and  balanced  movement. 

efferents  from  the  neocerebellum:  As  discussed  above, 
efferent  fibers  from  the  neocerebellar  cortex  terminate  exclusively  in 
the  lateral  dentate  nucleus,  which  in  turn  serves  as  the  unique  source 
of  efferent  fibers  leaving  the  neocerebellum.  A  small  percentage  of 
efferent  fibers  from  the  dentate  nuclei  return  to  the  cerebellar  cortex 
forming a circular feedback loop crucial to the cerebellum's function
in  continuously  monitoring  and  modifying  ongoing  activity. 
However,  the  majority  of  the  efferent  fibers  from  the  dentate  nucleus 
leave  the  cerebellum  and  project  to  the  thalamus  (ventral  lateral  and 
ventral  anterior  nuclei).  The  efferent  fibers  from  these  thalamic 
nuclei  subsequently  project  to  diverse  regions  of  the  cerebral  cortex 
including:  the  frontal  motor  areas  of  the  cerebral  cortex  (areas  4  & 
6),  the  prefrontal  cortex  (area  8),  the  frontal  association  cortex, 
Broca's  expressive  speech  area  (areas  44  &  45),  and  Wernicke's 
receptive  speech  area  (areas  22  &  42)  (cf.  Figure  1). 
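
The afferent and efferent pathways just described can be summarized as a small directed graph, which makes it easy to check that they do close into a loop. The sketch below is a simplification under stated assumptions: the node labels are informal shorthand chosen here, all cerebral cortical areas are collapsed into a single node, and the helper names `find_path` and `loop_through` are hypothetical.

```python
from collections import deque

# Directed connections named in the text (labels are informal shorthand).
CONNECTIONS = {
    "cerebral_cortex": ["pontine_nuclei", "red_nucleus"],
    "pontine_nuclei": ["neocerebellar_cortex", "dentate_nucleus"],
    "red_nucleus": ["olivary_nucleus"],
    "olivary_nucleus": ["neocerebellar_cortex", "dentate_nucleus"],
    "spinal_tracts": ["neocerebellar_cortex", "dentate_nucleus"],
    "neocerebellar_cortex": ["dentate_nucleus"],
    "dentate_nucleus": ["thalamus", "neocerebellar_cortex"],
    "thalamus": ["cerebral_cortex"],
}


def find_path(graph, start, goal):
    """Breadth-first search: shortest list of nodes from start to goal, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        for nxt in graph.get(path[-1], []):
            if nxt == goal:
                return path + [nxt]
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


def loop_through(graph, node):
    """Shortest closed circuit that starts and ends at node, or None."""
    return find_path(graph, node, node)


cortical_loop = loop_through(CONNECTIONS, "cerebral_cortex")
spinal_loop = loop_through(CONNECTIONS, "spinal_tracts")
```

Under these assumptions the cerebral cortex lies on a closed circuit passing through the dentate nucleus and the thalamus, whereas the ascending spinal tracts serve only as an input and lie on no loop.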

In  summary,  the  neocerebellar  cortex  and  the  dentate  nuclei 
receive  identical  stimuli  from  the  cerebral  cortex  and  the 
environment, enabling the dentate nuclei to respond directly to
sensory,  proprioceptive,  and  cerebrocortical  stimuli  while 
simultaneously  being  monitored  and  influenced  (i.e.,  inhibited)  by 
the  neurons  of  the  neocerebellar  cortex.  The  information  conveyed 
to  the  neocerebellum  from  the  cerebral  cortex  includes  stimuli  from 
several  areas  with  identified  language  related  functions  including  the 
frontal  cortical  association  areas  involved  in  abstract  reasoning  and 
planning,  the  prefrontal  cortex  responsible  for  planning  and 
monitoring,  Broca's  expressive  speech  area,  the  motor  speech 
cortex,  and  Wernicke's  receptive  speech  area.  Given  its  extensive, 
highly  organized  and  neuron-dense  networks,  the  neocerebellar 
cortex  is  able  to  process  the  incoming  information  at  high  speeds.  It 
then  performs  the  computations  required  to  integrate  input  from  the 
cerebrocortical,  sensory,  and  motor  systems  and  produce  a  detailed 
program  for  the  execution  of  the  desired  linguistic  behavior.  On  the 
basis  of  this  program,  the  neocerebellar  cortex  is  then  able  to 
monitor,  in  an  "informed"  manner,  the  firing  of  the  dentate  nuclei 
which  ultimately  (by  way  of  the  thalamus)  convey  excitatory  stimuli 
to  the  motor  neurons  of  the  cerebral  cortex  resulting  in  the  motor 
activity  of  the  speech  organs.  Thus,  although  the  signals  conveyed 
to  the  cortical  motor  neurons  originate  in  the  cells  of  the  dentate 
nuclei,  the  firing  of  these  cells  is  crucially  guided  by  the  activity  of 
the  neurons  of  the  cerebellar  cortex,  which  are  thereby  capable  of 
determining  the  outcome  of  motor  speech  activity  by  suppressing 
(inhibiting)  undesired  behaviors  and  allowing  (disinhibiting)  desired 
behaviors. 

Model  of  Procedural  Linguistic  Skill  Acquisition 

The  model  presented  here  proposes  that  procedural  linguistic 
skill  acquisition  occurs  essentially  as  follows.   To  begin,  general 



linguistic  plans  or  behavior  structures  regarding  desired  future 
linguistic  activity  are  sent  to  the  cerebellum  from  the  expressive 
speech  area  of  the  cerebral  cortex  (Broca's  area)  and,  presumably 
under  the  guidance  and  monitoring  influence  of  the  prefrontal 
cortex,  they  are  operationalized  within  the  neocerebellar  cortical 
networks  and  the  related  neural  circuitry  (Figure  3). 9  The 
operationalization  of  these  general,  conceptual  linguistic  plans  or 
behavior  structures  involves  extensive  and  high-speed  parallel 
processing  and  the  integration  of  motor  and  cognitive  activities,  and 
results  in  the  production  of  detailed,  precisely  timed  programs  for 
the  execution  of  linguistic  activity.  The  newly  created  programs  are 
then  relayed  back  to  the  expressive  speech  region  of  the  cerebral 
cortex  and  can  ultimately  be  used  to  orchestrate  and  oversee  motor 
speech  activity.  During  the  acquisition  process  on-going  linguistic 
activity  is  monitored  and  evaluated  by  the  neocerebellum  and  the 
prefrontal  cortex,  and  information  concerning  the  relative  success  or 
failure  of  the  performance  of  the  novel  skill  is  used  to  create 
appropriate  cognitive  and  linguistic  plans  for  future  action  and  to 
help  direct  focused  attention  to  aspects  of  the  developing  program 
which  require  improvement  and  fine-tuning.  Subsequent 
performances  of  the  novel  skill  result  in  gradual  and  incremental 
long-term  gains  in  the  accuracy  of  the  cerebellar  program  and  the 
speed  with  which  it  is  executed. 

In  such  a  model,  the  role  of  the  prefrontal  cortex  would  be 
most  essential  during  the  early  stages  of  skill  acquisition  when  the 
novel linguistic plans are first conveyed and refined into detailed,
precisely  timed  cerebellar  programs  for  linguistic  activity.  In  fact, 
this  assumption  is  generally  in  accordance  with  what  is  known  about 
the  involvement  of  the  prefrontal  cortex  during  the  acquisition  of 
novel  skills.  Fuster  (1992)  reviews  a  large  body  of  clinical  and 
experimental  evidence  which  suggests  that  "the  prefrontal  cortex  is 
essentially  involved  in  the  formation  of  behavior  structures"  and  of 
crucial  importance  when  those  behavior  structures  are  either  "novel 
to  the  organism  or  unusually  complex  in  their  sensory  or  motor 
aspects"  (Fuster,  1988,  1992:  352).  Research  also  suggests  that 
frontal  cortical  planning  activities  may  be  responsible  for  ensuring 
that  attention  is  directed  towards  selected  aspects  of  the  incoming 
linguistic  and  sensory  information  (Lem,  this  volume)  enabling 
input  to  be  transformed  into  intake  (Sato  &  Jacobs,  this  volume), 
and  may  result  in  what  is  experienced  by  the  learner  as 
consciousness  (cf.  Bridgeman,  1992).  According  to  the  proposed 
model, once the neocerebellum has successfully acquired (i.e.,
Figure 3. Schematic representation of the proposed model of procedural
linguistic  skill  acquisition  illustrating  directional  flow  of 
information  and  anatomical  connections  between  relevant  brain 
regions. 

operationalized  and  refined)  the  detailed  procedural  activity 
programs  required  for  the  execution  of  a  given  cerebral  linguistic 
plan,  the  monitoring  activities  of  the  prefrontal  cortex  are  no  longer 
required  to  ensure  accurate  performance  and  cease  to  be  routinely 
involved  in  the  performance  of  the  procedural  linguistic  skills. 
Decreasing  involvement  of  frontal  cortical  planning  activities  is 
experienced  as  a  gradual  fading  of  conscious  awareness  of  the  task 
and  decreasing  need  for  focused  selective  attention  during 
procedural  skill  execution,  and  it  effectively  allows  prefrontal 
planning  capacities  to  be  devoted  to  higher-order  communicative 
goals.  10  This  decreasing  need  for  the  involvement  of  prefrontal 
planning  and  monitoring  activities  as  once-novel  skills  are  gradually 
acquired  may  be  considered  a  neurobiological  correlate  of  what  SLA 
researchers  have  referred  to  as  automaticization.  A  number  of  SLA 
researchers  have  argued  that  the  successful  acquisition  of  a  second 
language, or any other complex cognitive skill, requires the gradual
integration  of  the  individual  sub-components  of  a  complex  skill 
which  occurs  over  time  as  processes  that  were  initially  "controlled" 
(i.e.,  highly  cognitively  demanding)  become  "automatized"  (i.e., 
less  cognitively  demanding)  (Faerch  &  Kasper,  1985;  McLaughlin, 
1987).  Attempting  to  apply  the  current  neurobiological  framework 
to  cognitive  theories  of  acquisition,  "controlled"  processes  can  be 
equated  with  procedural  linguistic  skills  in  the  process  of  being 
acquired,  still  requiring  frontal  cortical  monitoring  to  ensure  accurate 
performance,  and  "automatic"  processes  with  procedural  skills 
which  have  been  fully  acquired  (i.e.,  operationalized  and  refined) 
and  are  executed  within  the  circuitry  of  the  cerebellum  and  related 
neural  systems  without  prefrontal  involvement.  Inconsistencies 
observed  in  the  performance  of  skills  being  acquired  are  therefore 
considered  a  result  of  the  limited  capacities  of  the  prefrontal  region 
of  the  cerebral  cortex,  which  can  only  plan  for  and  direct  focused 
selective  attention  to  a  limited  number  of  tasks  at  one  time. 

Stages  in  the  Acquisition  of  a  Procedural  Skill 

The  neurobiological  process  of  acquiring  a  procedural 
linguistic  skill  is  a  gradual  and  incremental  one  which,  according  to 
the  current  model,  involves  three  stages:  formulation,  evaluation, 
refinement  (Table  1). 

Formulation:  As  suggested  by  this  model,  the  initial  acquisition 
stage  begins  when  the  general  cognitive  and  conceptual  linguistic 
plans that serve as a model of desired activity or behavior are first
conveyed  to  the  cerebellum  from  regions  of  the  prefrontal  and 
frontal  cortex  and  Broca's  expressive  speech  area.  This  stage 
involves  the  integration  of  desired  linguistic  behaviors  with 
incoming  sensory  information  concerning  the  external  environment 
and  on-going  activity.  During  this  initial  formulation  stage,  the 
dense  networks  of  the  cerebellar  cortex  and  the  circuitry  involving 
related  neural  systems  are  assumed  to  perform  the  computations 
necessary  to  operationalize  general  cortical  linguistic  plans  and  to 
produce  a  detailed,  precisely  timed  program  for  the  execution  of 
linguistic  activity. 

Evaluation:  During  the  evaluation  stage  the  developing  program  is 
presumably  conveyed  back  to  Broca's  area  and  ultimately  results  in 
motor speech output which is closely monitored by both the
Table 1:
Procedural  Skill  Acquisition  Stages 

Formulation: 

— General  cognitive  and  conceptual  linguistic  plans  for 
future  action  conveyed  to  cerebellum  from  Broca's  area  and 
prefrontal  and  frontal  association  cortex; 

— Integration,  processing  and  computation  of 
information  in  neocerebellar  circuits  and  related  neural 
systems  (e.g.,  basal  ganglia); 

— Formulation  of  detailed  execution  program  for 
linguistic  activity. 

Evaluation: 

— Execution  of  program  with  simultaneous  monitoring  and 
evaluation  of  actual  performance,  as  compared  with  desired 
behavior  within  the  cerebellar  circuitry  and  the  prefrontal  cortex; 

— Using  information  concerning  relative  success  or  failure 
in  execution,  the  prefrontal  cortex  responds  with  new  plans  for 
future  activity  designed  to  improve  performance  and  inform 
direction  of  selective  attention  to  relevant  linguistic  features  of 
input/output. 

Refinement: 

— Subsequent  performances  of  skill  benefit  from  updated 
prefrontal  planning  and  enhanced  perception  of  relevant 
linguistic  features  involved  in  execution; 

— Detailed  program  gradually  becomes  more  accurate  and 
coordinated; 

— Requirements  of  cortical  planning  and  selective  attention 
to  the  task  decrease,  and  frontal  association  cortex  and 
attentional  systems  gradually  cease  involvement  during 
the  execution  of  the  skill. 


cerebellum  and  the  prefrontal  cortex.  The  continuous  monitoring  of 
the  on-going  activity  and  comparison  with  the  desired  behaviors 
which occur within the cerebellum and the prefrontal cortex result
in  an  evaluation  of  the  relative  success  of  initial  attempts  to  perform 
the  novel  skill.  Based  upon  this  evaluation,  the  prefrontal  cortex  can 
then  respond  by  generating  appropriate  plans  for  future  action 
designed  to  improve  subsequent  performance  and  direct  attentional 
systems to focus selected attention on relevant features of the
linguistic  input/output. 

The  evaluation  and  ensuing  plans  may  be  communicated  to 
and  stored  within  the  medial  temporal  lobe  system,  resulting  in  a 
declarative  memory  of  the  learning  experience  and/or  explicit 
knowledge  of  one's  strengths  and  weaknesses  with  respect  to  the 
skill  being  acquired  (Kleiter  &  Schwarzenbacher,  1989).  This 
explicit  declarative  knowledge  related  to  the  acquisition  process  may 
prove  to  be  invaluable  to  the  speaker/learner's  effective  use  of 
cognitive  and  metacognitive  learning  strategies  and  may  serve  as  the 
primary  basis  upon  which  they  judge  their  relative  success  or  failure 
as  learners.  11 

Refinement:  Finally,  during  the  refinement  stage,  the  updated 
cognitive  and  linguistic  cortical  plans  can  serve  to  facilitate  the 
subsequent  refinement  and  debugging  of  the  execution  programs  of 
the  cerebellum.  The  gradual  improvement  and  fine-tuning  of  the 
cerebellar  execution  programs  during  the  refinement  stage  of 
acquisition  can  then  result  in  long-term  gains  in  the  speed, 
consistency,  and  accuracy  of  the  performance  of  the  newly  acquired 
procedural skill. The gradually increasing accuracy and ease with
which the skill is executed by the neocerebellar circuitry accounts for
the decreasing involvement of prefrontal cortical monitoring activity,
which in turn results in a gradually fading conscious awareness of, and
focussed attention during, performance of the procedural linguistic skill.
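
The three stages can be caricatured as a simple error-correcting loop. The sketch below is an illustration only and makes strong simplifying assumptions that are not part of the proposed model: the cerebellar "program" is reduced to a single number, the desired behavior to a nonzero `target`, and prefrontal involvement is assumed to scale with the size of the remaining error. The function name `acquire_skill` and the parameters `learning_rate`, `tolerance`, and `max_trials` are hypothetical.

```python
def acquire_skill(target, learning_rate=0.3, tolerance=0.05, max_trials=50):
    """Toy formulation-evaluation-refinement loop.

    Returns a list of (trial, error, prefrontal_involvement) tuples.
    Assumes target is nonzero.
    """
    program = 0.0  # formulation: an initial, crude execution program
    history = []
    for trial in range(1, max_trials + 1):
        # Evaluation: compare actual performance with the desired behavior.
        error = target - program
        # Assumption: prefrontal monitoring is proportional to relative error.
        prefrontal = min(1.0, abs(error) / abs(target))
        history.append((trial, error, prefrontal))
        if abs(error) <= tolerance:
            break  # skill treated as acquired; monitoring no longer needed
        # Refinement: adjust the program by a fraction of the observed error.
        program += learning_rate * error
    return history


history = acquire_skill(target=1.0)
involvement = [prefrontal for _, _, prefrontal in history]
```

With these settings `involvement` starts at its maximum and falls on every trial, a rough analogue of the gradual withdrawal of prefrontal monitoring as the program becomes more accurate.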


SUPPORT  FOR  THE  PROPOSED  MODEL 


PET  Studies 

Research  conducted  with  positron  emission  tomography 
(PET)  imaging  technology  has  attempted  to  identify  the  neural 
structures  activated  during  the  performance  of  a  limited  subset  of 
linguistic  functions:  semantic  word  association  tasks  (Raichle, 
1990;  Peterson,  Fox,  Posner,  Mintun  &  Raichle,  1989).  By  asking 
subjects  to  (1)  look  at  or  listen  to  a  word,  (2)  say  that  word  aloud, 
and  (3)  provide  another  word  which  was  semantically  associated  in 
a  predetermined  manner  to  the  original  word,  researchers  were  able 
to  identify  the  neuroanatomical  systems  actively  involved  in  the 
linguistic  task  of  semantic  association.  In  addition,  by  tracking  the 
brain  activity  of  subjects  over  time  they  were  able  to  distinguish 
between those structures presumably responsible for performance of
the  linguistic  skill  of  word  association  and  those  involved  only  in  the 
acquisition  of  the  skill.  The  results  of  these  PET  studies  confirm  a 
role  for  the  neocerebellum  in  procedural  linguistic  activity  which  is 
distinct  from  its  traditionally  assigned  motor  speech  role.  The  right 
lateral  portion  of  the  neocerebellum  was  active  during  both  the  initial 
acquisition  and  the  subsequent  performance  of  the  word  association 
task. Significantly, the right hemicerebellum communicates with the
left  (language  dominant)  cerebral  hemisphere,  further  suggesting 
that  the  neocerebellum  is  actively  involved  in  specifically  linguistic 
functions.  Regions  of  the  frontal  cerebral  cortex  and  the  anterior 
cingulate  gyrus  (implicated  in  focused  selective  attention)  were  also 
actively  involved  during  the  acquisition  phase  but,  significantly, 
were  not  involved  in  later  performance  of  the  task.  These  findings, 
although  far  from  conclusive,  are  consistent  with  the  current 
proposal  that  the  planning  and  monitoring  activities  of  the  frontal 
association  cortex  and  the  selective  attentional  capacity  of  the 
anterior  cingulate  gyrus  are  required  only  during  the  acquisition 
process  when  the  task  is  novel  and  requires  focussed  selective 
attention,  whereas  the  activity  of  the  neocerebellum  is  involved 
throughout  the  acquisition  process  and  required  for  execution  even 
after  the  skill  has  been  successfully  acquired. 

William's Syndrome

Research  conducted  on  patients  with  William's  Syndrome 
provides  additional  support  for  the  proposed  role  of  the 
neocerebellum  in  the  acquisition  and  execution  of  procedural 
linguistic skills (Bellugi, Bihrle, Jernigan, Trauner & Doherty,
1990).  William's  Syndrome  (WS)  is  a  rare  neurological  disorder 
which  is  characterized  by  a  marked  reduction  in  cerebral  volume 
(80%  of  normal  size)  with  no  significant  reduction  in  cerebellar 
volume  (99%  of  normal).  When  compared  with  age  and  IQ  matched 
Down's  Syndrome  (DS)  subjects,  WS  subjects  performed 
significantly  better  on  linguistic  tasks,  demonstrating  remarkably 
preserved  syntactic  abilities  (Bellugi,  Bihrle,  Jernigan,  Trauner  & 
Doherty,  1990:  117).  These  results  are  perhaps  even  more  striking
in  light  of  the  fact  that  DS  subjects  show  significant  reduction  in 
both  cerebral  (77%  of  normal)  and  cerebellar  (69%  of  normal) 
volume.  Researchers  have  speculated  that  the  remarkable 
preservation  of  formal  linguistic  abilities  in  WS  subjects  in  contrast 
to  the  general  retardation  of  their  other  cognitive  capacities  may  be  a 
reflection  of  the  concurrent  reduction  in  cerebral  volume  and  relative 
preservation  of  cerebellar  volume,  assuming  of  course  that  the
cerebellum  plays  a  significant  role  in  linguistic  function  (Bellugi, 
Bihrle,  Jernigan,  Trauner  &  Doherty,  1990;  Leiner,  Leiner  &  Dow, 
1991). 12 

Deficits  Associated  with  Cerebellar  Lesions 

Evidence  is  accumulating  from  lesion  studies  that  damage 
restricted  to  some  regions  of  the  cerebellum  does  not  result  in  the 
typically  observed  motor  deficits,  but  instead  in  significant  cognitive 
impairment  (Leiner,  Leiner  &  Dow,  1987,  1989,  1991).  In  one 
study  five  patients  with  cerebellar  damage  were  reported  to 
demonstrate  substantially  impaired  performance  on  almost  all  tests 
administered  when  compared  with  ten  control  subjects  (Bracke-Tolkmitt
et  al.,  1989).  Significantly,  subjects  with  cerebellar
damage  were  impaired  on  all  measures  of  IQ,  including  both  verbal
IQ  and  general  ability  (Bracke-Tolkmitt  et  al.,  1989:  443).  Thus
far,  there  have  been  no  reports  of  significant,  specifically  linguistic 
(non-motor)  impairments  as  a  result  of  cerebellar  damage. 
However,  given  what  is  known  of  the  diffuse  and  distributed  nature 
of  memory  representation  in  the  cerebellar  circuitry,  generalized 
impairment  of  formal  linguistic  capacities  would  require  rather 
extensive  cerebellar  lesions  and  would  most  likely  result  in  damage 
to  motor  and  general  cognitive  capacities  as  well. 


IMPLICATIONS  OF  THE  PROPOSED  MODEL:
DECLARATIVE  "LEARNING"  AS  A  METACOGNITIVE
STRATEGY  FOR  PROCEDURAL  LINGUISTIC
SKILL  ACQUISITION


SLA  researchers  have,  for  many  years,  debated  the 
questions  of  whether,  to  what  extent,  and  in  precisely  what  manner 
language  teachers  ought  to  incorporate  formal,  explicit  grammar 
instruction  into  their  ESL  curriculum  (see  Celce-Murcia,  1992  and 
Krashen,  1992  for  current  perspectives  on  this  debate).  The  current 
proposal  can  contribute  to  a  future  resolution  by  offering  a  new 
conceptualization  of  the  problem,  in  addition  to  a  potential,  if  only 
partial,  solution.  Celce-Murcia  (1992)  asserts  that  formalized
grammar  instruction  is  probably  essential  if  post-pubescent
adolescents  and  adults  are  ever  to  acquire  near-native  linguistic
competence  in  an  L2,  and  that  such  instruction  must  be  embedded
within  the  meaningful  and  contextualized  use  of  language.
Nevertheless,  there  remains  significant  skepticism  regarding  the
usefulness  of  such  instruction.

Krashen  (1992),  for  instance,  concludes  that  because  of  the 
fundamental  distinction  between  conscious  "learning"  and 
unconscious  "acquisition"  processes,  the  effects  of  formal  grammar 
instruction  are  unavoidably  destined  to  be  "peripheral  and  fragile". 
However,  McLaughlin  (1990)  contends  that  the  notion  of 
consciousness  is  entirely  too  vague  to  be  of  much  use  in  theories  of 
SLA  and  should  be  avoided  in  favor  of  more  strictly  defined 
concepts  such  as  automatic  and  controlled  processes  and 
restructuring. 

I  propose  that  considering  these  facts  from  a  neurobiological 
perspective  might  provide  both  new  insight  into  the  nature  of  the 
problem  and  a  potential  conceptual  framework  within  which  to 
develop  a  solution.  Essentially,  I  suggest  that  traditional  approaches 
to  formalized  grammar  instruction  may  have  proved  of  limited 
usefulness  because  they  generally  resulted  in  students  acquiring 
declarative  knowledge  related  to  procedural  linguistic  skills  rather 
than  the  procedural  linguistic  skills  themselves.  Although  explicit 
declarative  knowledge  of  the  L2  linguistic  system  may  be  useful 
when  taking  a  written  exam  or  consulting  a  pedagogical  grammar 
text,  this  type  of  knowledge  is  an  insufficient  basis  for  the  fluent  and 
spontaneous  use  of  the  L2  for  communicative  purposes.  What  has 
yet  to  be  determined  is  how  formalized  grammar  instruction  can  be 
effectively  incorporated  into  the  ESL  curriculum  so  as  to  facilitate
learners'  acquisition  of  procedural  linguistic  skills.

Although  the  neurobiological  theories  of  learning  and 
memory  presented  above  support  the  claim  that  declarative 
knowledge  (i.e.,  "learning")  cannot  be  directly  transformed  into
procedural  skills  (i.e.,  "acquisition"),  they  do  not  preclude  the 
possibility  that  the  prior  acquisition  of  related  declarative  knowledge 
may  under  restricted  circumstances  serve  as  an  effective 
metacognitive  strategy  to  enhance  the  subsequent  acquisition  of 
procedural  linguistic  skills  (cf.  O'Malley  &  Chamot,  1991).  The 
term  metacognitive  strategy  is  intended  to  refer  to  a  strategy  for 
learning  which  involves  conscious,  explicit  consideration  and
planning  related  to  the  learning  process  itself.  The  above  scenario  is 
possible  only  if  the  declarative  knowledge  related  to  the  novel 
procedural  linguistic  skill  is  used  to  inform  and  improve  the 
planning  and  monitoring  activities  of  the  prefrontal  cortex  and  to 
enhance  the  direction  of  focussed  selective  attention  to  the  relevant
features  of  the  linguistic  system  in  both  the  environmental  input  and 
the  behavioral  output.  The  contribution  of  explicit  declarative
knowledge  of  the  L2  linguistic  system  to  the  successful  acquisition 
of  procedural  linguistic  skills  thus  lies  in  its  potential  contribution  to 
the  learner's  use  of  effective  metacognitive  strategies  to  facilitate  the 
learning  process  itself.  This  contribution,  however,  requires  active 
and  informed  involvement  on  the  part  of  both  the  L2  learner  and  the 
language  instructor  in  the  development  and  use  of  cognitive  and 
metacognitive  learning  strategies  as  well  as  a  general  level  of 
awareness  regarding  the  nature  of  learning  itself.  The  intentional 
and  strategic  use  of  declarative  linguistic  knowledge  in  the  process 
of  acquiring  a  second  language  as  an  adult  has  also  been  advocated 
on  independent  grounds  by  other  researchers  including  Celce- 
Murcia  (1992),  Wenden  (1991),  and  Widdowson  (1990). 


CONCLUSION 


In  this  paper,  I  have  presented  a  model  of  procedural  skill 
acquisition  which  crucially  involves  the  circuitry  of  the 
neocerebellum,  Broca's  area,  and  regions  of  the  frontal  and 
prefrontal  cerebral  cortex.  This  model,  although  it  remains  largely 
speculative,  is  a  plausible  neurobiological  account  of  the  acquisition 
of  procedural  linguistic  skills  and  offers  a  potential  means  of 
unifying  competing  cerebral  and  cerebellar  theories  of  language 
function.  In  addition,  the  model  provides  a  conceptual  framework 
for  further  investigation  of  the  potential  facilitatory  contribution  of 
explicit,  formally  learned  declarative  linguistic  knowledge  to  the 
successful  acquisition  of  procedural  linguistic  skills.  However, 
much  remains  to  be  done  in  terms  of  clarifying  and  more  precisely 
characterizing  the  neurobiological  processes  involved,  more 
thoroughly  addressing  the  nature  of  the  linguistic  knowledge  and  its 
representation,  and  incorporating  pragmatic  and  discourse  related 
knowledge. 

NOTES

1  I  have  chosen  to  focus  on  adult  SLA,  rather  than  PLA,  to  avoid  the
confounding  influences  of  neurological  and  cognitive  development. 

2  See  Pinker  &  Prince,  1988  for  additional  discussion  of  the  linguistic 
processes  involved  in  pluralization. 

3  Ojemann  himself  acknowledges  that  subcortical  structures  such  as  the 
thalamus  are  likely  to  be  involved  in  linguistic  function  (Ojemann,  1991).

4  See  Bialystok,  1990  for  discussion  of  the  benefits  of  such  a  compromise.

5  Brodmann's  areas  are  a  numerical  representation  of  designated  regions  of
the  cerebral  cortex  devised  by  Brodmann  (1909). 

6  Dative  alternation  refers  to  the  movement  of  the  indirect  object  (IO)  of  a
small  class  of  verbs  such  as  "give"  to  a  preposition-less  position  immediately
following  the  verb  as  an  alternative  to  the  IO  appearing  as  the  object  of  a  preposition
following  the  direct  object.

7  The  neocerebellum  is  the  most  recently  evolved  portion  of  the  cerebellum
and  comprises  the  posterior  and  lateral  portions  of  the  cerebellar  cortex  and  the
lateral  dentate  nucleus.

8  See  Loritz,  1991  for  discussion  of  the  demand  for  parallel,  rather  than
serial,  processing. 

9  No  assumptions  are  made  concerning  the  origin  of  these  linguistic 
commands.  It  is  conceivable  that  they  are  either  predominantly  learned  (i.e., 
abstracted  from  the  incoming  linguistic  data  perhaps  by  regions  of  frontal 
association  cortex)  or  predominantly  innate  (i.e.,  unlearned  principles  and 
parameters  of  Universal  Grammar).  For  detailed  discussion  of  this  topic  see  Jacobs 
(1988),  Crain  (1991),  and  Jacobs  &  Schumann  (1992).

10  See  Levelt,  1978,  1989,  for  a  hierarchy  of  communicative  goals. 

11  See  O'Malley  &  Chamot,  1990  for  discussion  of  effective  use  of  learning 
strategies  in  SLA. 

12  An  alternative  explanation  might  be  that  WS  subjects,  unlike  DS 
subjects,  have  an  essentially  intact  perisylvian  cortex  and  thus  display  preserved 
linguistic  abilities. 

From  Input  to  Intake:

Towards  a  Brain-Based  Perspective 

of  Selective  Attention 


Edynn  Sato  and  Bob  Jacobs 
University  of  California,  Los  Angeles 

Anatomy  is  destiny. 
(Freud,  attributed) 


From  a  neurobiological  perspective,  the  present  paper  addresses 
(1)  the  input-intake  distinction  commonly  made  in  applied 
linguistics,  and  (2)  the  role  of  selective  attention  in  transforming 
input  to  intake.  Primary  emphasis  is  placed  on  a  neural  structure 
(the  nucleus  reticularis  thalami)  that  appears  to  be  essential  for 
selective  attention.  The  location,  connections,  structure,  and 
physiology  of  the  nucleus  reticularis  thalami  are  examined  to 
illustrate  its  critical  role  in  information  processing.  By  orchestrating 
the  selection  and  enhancement  of  relevant  sensory  input,  the  nucleus 
reticularis  thalami  acts  as  a  "conductor"  of  neural  systems  involved 
in  learning.  It  is  argued  that  investigations  of  brain  structures  such 
as  the  nucleus  reticularis  thalami  provide  a  more  fundamental 
understanding  of  language  acquisition  mechanisms. 


INTRODUCTION 


First  and  second  language  learners  must  interact  with  the 
environment  to  acquire  the  target  language.  Interaction  with  the 
external  milieu  continually  shapes  the  internal  milieu  supporting  the 
mental  systems  of  language,  cognition,  and  social  meanings  (Hatch 
&  Hawkins,  1987).  In  the  neurobiological  perspective  adopted  by 
the  present  paper,  language  is  viewed  as  a  multimodal  sensory 
enhancement  system  (Jacobs,  1988),  that  is,  a  system  that  depends 
on  the  primary  senses  (i.e.,  audition  and  vision)  for  the  linguistic
and  contextual  information  they  bring  into  the  brain,  where  meaning 
is  derived  by  comparing  incoming  sensory  information  with  extant 
neural  structures  formed  by  experience.  Learning,  including 
language  learning,  thus  involves  the  experience-dependent 
generation  or  modification  of  enduring  internal  representations 
(Dudai,  1989;  cf.  Jacobs  &  Schumann,  1992). 

The  interaction  between  the  learner  and  the  environment  not 
only  takes  place  in  the  context  of  an  "action  dialogue"  (Bruner, 
1975,  p.  284),  but  also  within  a  localized  situational  and  larger 
socio-cultural  context  (cf.  Ochs,  1982;  Ochs  &  Schieffelin,  1984; 
Schieffelin  &  Ochs,  1986;  Ochs,  1988;  Schieffelin,  1990). 
Language  learners  are  continually  exposed  to  far  more  information 
than  they  can  possibly  process.  Thus,  a  major  concern  for  language 
acquisition  researchers  is  how  language  learners  selectively  attend  to 
information  in  the  environment,  that  is,  how  input  becomes  intake. 
From  a  neurobiological  perspective,  the  present  paper  first  discusses 
the  distinction  between  input  and  intake  that  is  commonly  made  in 
applied  linguistics.  Because  the  input-intake  distinction  is  intimately 
related  to  the  concept  of  selective  attention,  we  briefly  explore  the 
role  of  selective  attention  in  modulating  information  flow.  Finally, 
we  attempt  to  provide  a  more  fundamental  understanding  of  selective 
attention  by  presenting  a  neural  structure  that  appears  to  be  involved 
in  transforming  input  into  intake. 


INPUT  AND  INTAKE: 
THE  NEED  FOR  SELECTIVE  ATTENTION 


In  second  language  acquisition  (SLA),  input  and  intake  are 
characterized  as  both  objects/products  and  processes.  When  input  is 
characterized  as  an  object,  it  may  be  equated  with  the  source  of 
information  to  which  the  learner  is  attending.  This  source  of 
information  cannot  be  limited  to  exposure  of  a  linguistic  nature 
because,  even  though  a  great  deal  of  sensory  information  is  ignored 
or  discarded  as  irrelevant  by  the  brain,  it  is  never  clear  exactly  what 
the  individual  is  perceiving.  When  input  is  characterized  as  a 
process  (Young,  1988),  it  appears  to  overlap  with  the  psychological 
constructs  of  comprehensible  input  (Krashen,  1981,  1982,  1985) 
and  intake  (Corder,  1967),  which  have  been  hypothesized  to 
"explain"  how  learners  "internalize"  the  linguistic  patterns  available 
in  the  input.  These  constructs  are  presumed  to  account  for  the
portion  of  the  input  that  actually  makes  it  into  the  learner's  head  as
an  organized  and  retrievable  form  of  knowledge.  The  i  in  Krashen's 
well-known  metaphor,  i  +  1,  is  an  attempt  to  characterize  the  state  of 
the  learner's  brain  at  a  given  point  in  acquisition,  but  it  has  been 
severely  criticized  for  failing  to  explain  the  acquisition 
"mechanisms"  involved  (Gregg,  1984).  Chaudron  (1985)  has 
elaborated  on  these  "mechanisms"  by  suggesting  that  intake  is  a 
complex  process  (rather  than  a  product)  involving  the  perception  and 
encoding  of  input,  followed  by  integration  of  linguistic  information 
into  the  developing  grammar. 

As  helpful  as  these  attempts  to  describe  input  and  intake  may 
be,  they  remain  abstract  characterizations  of  learner  behavior  and,  as 
such,  reveal  nothing  of  the  underlying  mechanisms  involved  (Jacobs 
&  Schumann,  1992).  The  present  neurobiological  perspective 
attempts  to  describe  underlying  neural  mechanisms  responsible  for 
processing  input  and  intake.  Thus,  in  the  present  perspective,  input 
is  viewed  as  the  object  of  the  learner's  attention  and  intake  is  viewed 
as  the  product  of  information  processing  in  the  brain,  which  is 
discussed  below. 

Selective  attention:  Modulating  information  flow

Language  learning  requires  that  the  brain's  processing 
systems  have  access  to  relevant  input.  This  is  accomplished  through 
selective  attention,  a  phenomenon  whereby  an  individual  directs 
attention  towards  and  maintains  attention  on  the  stimuli  of  relevance. 
The  present  section  will  describe  selected  conceptualizations  of  the 
attentional  process:  the  neurobiological  concepts  of  early  and  late 
selection,  and  the  psychological  concepts  of  bottom-up  and  top- 
down  processing. 

Neurobiologically,  meaning  is  constructed  from  the 
interaction  of  sensory  information  (i.e.,  the  "external  context")  with 
prior  knowledge  as  it  is  organized  in  the  brain  (i.e.,  the  "internal 
context")  (Jacobs,  1991).  It  is  the  internal  context  that  influences  an 
individual's  selective  attention  and  subsequent  understanding  of 
input.  Presently,  there  is  disagreement  about  how  the  internal 
context  influences  selective  attention  because,  theoretically  at  least, 
its  influence  may  be  manifested  in  one  of  two  ways:  early  selection 
or  late  selection.  With  regard  to  early  selection,  the  internal  context 
restricts  the  capacity  for  sensory  processing,  necessitating  a 
"filtering"  of  input  based  on  simple  characteristics  of  the  stimulus, 
prior  to  semantic  encoding  (Broadbent,  1958;  Edelman,  1989; 
LaBerge,  1990;  Mangun  &  Hillyard,  1990;  Corbetta,  Miezin, 
Dobmeyer,  Shulman  &  Petersen,  1991).  With  regard  to  late
selection,  the  internal  context  sets  no  limit  on  sensory  processing, 
allowing  selective  attention  to  occur  simultaneously  with  or 
following  semantic  encoding  (Edelman,  1989;  Mangun  &  Hillyard, 
1990;  Corbetta  et  al.,  1991).  Whether  it  occurs  early  or  late, 
selective  attention  is  the  outcome  of  multiple  mechanisms  mediating 
action  directed  toward  achieving  goals  or  satisfying  criteria  set  by 
the  individual's  internal  context  (Treisman,  1960;  Treisman  &
Gelade,  1980;  Crick,  1984;  Allport,  1987;  Neumann,  1987; 
Edelman,  1989). 

Psychologically,  there  are  two  Vygotskian-based  theories  of
how  patterns  in  stimuli  are  recognized  and  sensory  events  are
given  meaning:  bottom-up  processing  and  top-down  processing
(Rogoff,  1990).  Bottom-up  processing  suggests  that  a  new 
stimulus  is  examined  by  its  basic  elements  or  features.  This 
processing  is  "bottom-up"  because  the  stimulus  must  be  analyzed 
into  specific  features  or  building  blocks  before  being  assembled  into 
a  meaningful  pattern.  Top-down  processing  examines  a  stimulus, 
not  by  discrete  feature  analysis,  but  by  rapid  pattern  organization, 
making  use  of  situational  context.  Both  neurobiologically  and 
psychologically,  learning  thus  begins  with  perception  of,  and  focused
attention  on,  a  stimulus.

Although  psychological  theories  and  information  processing 
models  have  made  significant  progress  in  explaining  the  role  of 
selective  attention  in  learning  (Shiffrin  &  Atkinson,  1969;  Craik  & 
Lockhart,  1972),  they  cannot  adequately  explain  how  one  attends  to 
appropriate  stimuli  nor  do  they  address  the  neural  mechanisms 
involved  in  selective  attention.  However,  a  brain-based  model  can. 
Just  as  Krashen's  (1985)  "affective  filter"  has  found  neural 
correlates  in  the  work  of  Schumann  (1990,  1991),  selective  attention 
appears  to  fall  within  the  domain  of  a  neural  structure  known  as  the 
nucleus  reticularis  thalami  (NRT).  The  following  section  describes 
the  NRT  and  discusses  its  involvement  in  selective  attention. 

A  PUTATIVE  BRAIN  STRUCTURE
FOR  SELECTIVE  ATTENTION:
THE  NUCLEUS  RETICULARIS  THALAMI


Numerous  networks  involved  in  selective  attention  have 
been  identified  (Posner,  Inhoff,  Friedrich  &  Cohen,  1987;  Robbins 
&  Everitt,  1987;  Desimone  &  Ungerleider,  1989;  LaBerge,  1990; 
Mangun  &  Hillyard,  1990;  Posner  &  Petersen,  1990;  Corbetta  et
al.,  1990,  1991;  Cohen  &  Rafal,  1991).  However,  it  remains 
unclear  how  these  multiple  systems  work  in  concert.  Although  the 
present  paper  will  not  elaborate  on  the  cohesion  of  this  "cerebral 
symphony"  (Calvin,  1989),  it  will  discuss  a  neural  structure  that 
seems  to  function  as  its  "conductor,"  coordinating  the  expression 
and  salience  of  the  various  instruments  in  this  orchestra.  The 
"conductor"  is  the  nucleus  reticularis  thalami  (NRT). 

The  NRT  is  part  of  the  thalamus,  a  structure  through  which 
all  sensory  information,  with  the  exception  of  olfaction  (=  the  sense 
of  smell),  must  pass  before  being  further  processed  in  the  region  of 
the  brain  known  as  the  cerebral  cortex  (Figure  1).  The  NRT's 
involvement  in  selective  attention  is  not  surprising  because  it  appears 
to  share  both  morphological  and  functional  characteristics  with  the 
brainstem  reticular  formation  (BRF),  a  diffuse  collection  of  neurons 
with  far-reaching  connections  throughout  the  brainstem. 1
Morphologically,  NRT  neurons  (=  nerve  cells)  resemble  those  of  the 
BRF  in  terms  of  size  and  general  branching  patterns  (Carpenter  & 
Sutin,  1983).  Functionally,  both  NRT  and  BRF  neurons  are 
involved  in  arousal  (Carpenter  &  Sutin,  1983;  Scheibel,  1984).  The 
BRF  responds  to  sensory  stimulation  and,  via  pathways  that  ascend 
through  the  brainstem  to  the  cortex  (in  particular  the  lemniscal  and 
the  ascending  reticular  activating  systems),  exerts  its  influence  over 
broad  areas  of  the  cortex,  evoking  arousal  responses  (Carpenter  & 
Sutin,  1983).  En  route  to  the  cortex,  the  ascending  pathways 
project  upon  (=  make  connections  with)  regions  of  the  thalamus  and 
thus  involve  the  NRT  (Carpenter  &  Sutin,  1983;  Steriade,  Jones  &
Llinas,  1990). 

To  illustrate  the  NRT's  role  in  selective  attention,  the 
following  sections  sequentially  discuss  its  (1)  location,  (2) 
connections,  (3)  structure,  and  (4)  physiology.  The  location  and 
connections  of  the  NRT  underscore  its  essential  involvement  in 
selective  attention.  How  the  NRT  participates  in  selective  attention 
becomes  clear  when  one  examines  its  structure  and  physiology. 

Figure  1.  Photographs  of  the  left  hemisphere  of  the  human  brain.  Lateral  (top
photo)  and  medial  (bottom  photo)  views  demonstrate  the  relative 
position  of  the  cerebellum  (C)  and  the  four  major  lobes  of  the  cerebral 
cortex:  frontal  (F),  parietal  (P),  occipital  (O)  and  temporal  (T).  In 
addition  to  these  regions,  the  lateral  perspective  illustrates  the 
relative  position  of  the  basal  ganglia  (BG)  and  amygdala  (A),  both  of 
which  are  deep  to  the  cerebral  cortex.  The  medial  perspective 
reveals  the  relative  position  of  the  thalamus  (Th),  cingulate  gyrus 
(Cg),  brainstem  (B),  and  the  temporal  lobe  (Tp).  Running  throughout
the  brainstem  is  the  diffuse  network  of  cells  known  as  the  brainstem 
reticular  formation  (BRF)  (not  shown).  The  framed  area  in  the  medial 
view  outlines  the  portion  enlarged  for  Figure  2.  Ant  =  Anterior,  Post 
=  Posterior. 

Location


As  mentioned  earlier,  language  acquisition  is  dependent  on  the 
primary  senses  (i.e.,  audition  and  vision)  for  bringing  linguistic  and 
contextual  information  into  the  brain.  All  sensory  information 
entering  the  brain,  with  the  exception  of  olfaction,  passes  through 
the  thalamus,  where  it  is  associated  and  synthesized,  before 
proceeding  to  the  cerebral  cortex.  The  cerebral  cortex,  which 
processes  and  responds  to  sensory  information,  is  traditionally 
divided  into  four  major  regions,  or  lobes,  each  of  which  serves  a 
specific  processing  function  (Figure  1).  In  order  for  sensory
information  to  reach  appropriate  areas  of  the  cortex,  it  must  be
directed  there.  The  thalamus  serves  this  function.

Figure  2.  Enlarged  view  of  the  medial  surface  of  the  human  brain  as  framed  in
Figure  1.  This  perspective  highlights,  somewhat  schematically,  the
position  of  the  nucleus  reticularis  thalami  (NRT)  on  the  dorsal
portion  of  the  thalamus.  By  virtue  of  its  position  in  the  center  of
information  flow,  all  incoming  sensory  impulses  (e.g.,  from  the
cerebellum  and  through  the  brainstem)  must  pass  through  the
thalamus  and  the  NRT  en  route  to  the  cerebral  cortex  (single  headed
arrows).  There  is  continual  two-way  communication  between  the
cerebral  cortex  and  the  thalamus  (double  headed  arrows),  all  of  which
also  passes  through  the  NRT.

The  thalamus  consists  of  several  nuclei,  or  groups  of  cells 
sharing  the  same  function.  One  of  these  nuclei  is  the  NRT,  a
sheet-like  complex  of  cells  enveloping  thalamic  nuclei  committed  to
sensory  and  associative  functions  (Scheibel  &  Scheibel,  1966; 
Jones,  1975,  1985;  Skinner  &  Yingling,  1977;  Angel,  1983; 
Avanzini,  de  Curtis,  Panzica  &  Spreafico,  1989).  As  shown  in 
Figure  2,  all  connections  from  the  thalamus  to  the  cortex 
(thalamocortical)  and  back  (corticothalamic)  pass  through  the  NRT 
(Jones,  1985).  By  virtue  of  its  location,  the  NRT  is  intimately 
involved  in  the  modulation  of  all  communication  between  the 
thalamus  and  the  cortex.  It  constitutes  a  high  resolution  organic 
"screen"  capable  of  monitoring  and  modulating  thalamo-cortico- 
thalamic  interactions  (Scheibel  &  Scheibel,  1966),  a  "screen"  that 
preferentially  enhances  certain  aspects  of  stimuli  and  simultaneously 
attenuates  the  salience  of  other  input. 

Connections 

Information  is  organized  topographically  in  the  central 
nervous  system  (CNS),  which  consists  of  the  brain  and  the  spinal 
cord.  In  other  words,  external  stimuli  are  mapped  onto  the  CNS  in 
an  orderly  fashion.  Auditory  impulses  are  mapped  tonotopically 
(i.e.,  frequency  relationships  between  sounds  are  preserved  from 
the  cochlea  of  the  inner  ear  to  the  auditory  cortex).  Visual  stimuli 
are  organized  retinotopically  (i.e.,  visual  field  images  on  the  retina 
are  transferred  faithfully  to  the  visual  cortex).  Tactile  and  motor 
information  are  represented  somatotopically  (i.e.,  the  relationships 
between  parts  of  the  body  are  maintained  in  CNS  representations). 
The  brain's  topographical  organization  permits  sensory  information 
to  be  processed  efficiently  in  circumscribed  networks  dedicated  to  a 
particular  type  of  information.  The  NRT  also  adheres  to  a  general 
topographical  organization  (Jones,  1975,  1985),  which  helps  it  to 
direct  in  an  orderly  manner  the  enormous  amount  of  sensory 
information  ascending  to  the  cortex.  The  NRT's  topographical 
relationship  with  the  thalamus  and  cortex  is  evidenced  by  the 
constant,  relatively  circumscribed  regions  of  the  NRT  through 
which  thalamocortical  and  corticothalamic  fibers  of  particular 
thalamic  nuclei  cross  (Steriade  et  al.,  1990).  Regions  of  the  NRT 
are  thereby  associated  with  a  thalamic  nucleus  or  group  of  nuclei 
(primarily  in  the  upper-most  or  dorsal  portion  of  the  thalamus)  and 
hence  a  sensory  or  functional  system  (Steriade  et  al.,  1990).  As
thalamocortical  and  corticothalamic  fibers  pass  through  the  NRT,  the
NRT  is  capable  of  influencing  information  flow.  The  NRT  works 
as  a  "gating  mechanism"  (see  below),  permitting  (i.e.,  selecting) 
passage  of  specific  information  for  further  processing  (Skinner  & 
Yingling,  1977;  Yingling  &  Skinner,  1977;  Scheibel,  1987;  Steriade 
et  al.,  1990). 

Regions  of  the  NRT  that  may  be  involved  with  language  are 
implicated  in  research  conducted  by  Ojemann  (1975,  1976,  1984). 
Using  electrical  stimulation  of  the  human  brain  in  patients 
undergoing  surgery  for  intractable  epilepsy,  Ojemann  identified 
specific  areas  of  the  thalamus  involved  with  language,  arousal,  and 
verbal  memory.  Projecting  a  series  of  slides  on  a  screen,  Ojemann 
tested  his  subjects'  ability  to  name  and  recall  objects  while 
electrically  stimulating  areas  of  the  thalamus.  Electrical  stimulation 
acts  as  a  temporary,  reversible  lesion  that  helps  identify  structures 
involved  in  a  particular  function.  Four  basic  types  of  language 
disturbances  resulted  from  this  stimulation:  arrest  of  speaking, 
anomia,  perseverance,  and  repetition. 2  These  dysfunctions  were
identified  by  stimulation  of  discrete  areas  of  the  left  thalamus 
(especially  the  ventrolateral  portion).  Within  this  context,  the  NRT 
theoretically  acts  as  a  modulator  of  nerve  impulses  conducted 
through  thalamocortical  fibers  (axons)  originating  in  the  left 
ventrolateral  thalamic  nuclei.  The  nerve  signals  passing  through  the 
NRT  are  believed  to  prime  the  cortex  of  the  left  hemisphere  for 
incoming,  linguistically  relevant  information  (Ojemann,  1975).

Many  structures  besides  the  cortex  are  involved  in  learning. 
Four  structures  are  of  particular  interest:  (1)  the  hippocampus,  (2) 
the  cerebellum,  (3)  the  basal  ganglia,  and  (4)  the  cingulate  gyrus 
(Edelman,  1989;  Seib,  1990).  The  hippocampus,  which  is  located 
deep  within  the  temporal  lobe,  consolidates  recently  acquired 
information  and  is  involved  in  laying  down  new  memories, 
including  spatial  and  episodic  memory  (Edelman,  1985;  Rolls, 
1990;  Kandel,  Schwartz  &  Jessell,  1991).  The  NRT  may  assist  the 
hippocampus  by  focusing  and  maintaining  attention  on  relevant 
stimuli. 

The  NRT  may  similarly  facilitate  cerebellar  processing. 
Traditionally  the  cerebellum  (shown  in  Figure  1),  which  sends 
information  to  the  cortex  via  the  thalamus,  has  been  recognized  for 
its  involvement  in  motor  coordination.  More  recently  the  cerebellum 
has  been  implicated  in  cognitive  functions  such  as  the  learning  of 
rote  memories  (Edelman,  1989;  Schmahmann,  1991;  Robbins,  this 
volume).  To  cite  one  example,  Petersen,  Fox,  Posner,  Mintun  and 
Raichle  (1989)  monitored  subjects  performing  a  semantic
association  task.  The  task  required  subjects  to  generate  a
semantically-associated  verb  for  a  series  of  nouns.  Using  Positron 
Emission  Tomography  (PET),  an  imaging  technique  yielding 
anatomical-functional  correlations,  they  discovered  activation  (i.e., 
increased  activity  and  therefore  greater  processing  demand)  in 
regions  of  the  cerebellum  distinct  from  areas  involved  in  motor 
tasks.  These  results  strongly  suggest  "cognitive"  functions  be 
added  to  the  cerebellum's  well-known  sensory  and  motor  repertoire. 
The  basal  ganglia,  which  consist  of  several  substructures  (caudate, 
putamen,  globus  pallidus,  amygdala,  and  claustrum),  serve  an 
associative  function,  connecting  sensory  and  conceptual 
categorization  with  motor  responses  (Edelman,  1989)  (Figure  1).3
Many  basal  ganglia  interconnections  with  the  cortex  are  mediated  by 
the  thalamus.  With  a  passive  presentation  of  visual  words, 
Petersen,  Fox,  Posner,  Mintun,  and  Raichle  (1989)  found  that  an 
area  of  the  basal  ganglia  (the  left  lateralized  area,  possibly  the 
putamen)  was  activated.  They  concluded  that  the  basal  ganglia  may 
be  involved  in  lexical  or  letter  level  processing.  The  NRT  may 
influence  processing  in  the  basal  ganglia  by  indirectly  modulating 
information  traveling  between  the  cortex  and  the  basal  ganglia 
(particularly  to  the  putamen  and  caudate  nucleus)  (Carpenter  & 
Sutin,  1983). 

The  NRT  also  works  in  concert  with  the  cingulate  gyrus, 
which  is  part  of  a  somewhat  diverse  collection  of  brain  areas  known 
as  the  limbic  system  (Figure  1).  The  limbic  system  contributes  to 
emotion,  memory,  and  the  coordination  of  value-dependent  states 
(Kandel  et  al.,  1991;  LeDoux,  1992).  Petersen  and  colleagues 
(1989)  suggest  that  the  cingulate  gyrus  may  be  involved  in  response 
selection.  Subjects  were  asked  to  perform  two  tasks:  the  "generate 
uses"  and  the  "semantic  monitoring"  tasks.  The  generate  uses  task 
required  subjects  to  utter  semantically  associated  verbs  to  a  string  of 
visually  and  aurally  presented  nouns.  This  task  involved  both 
lexical  processing  and  response  selection.  The  semantic  monitoring 
task  required  subjects  to  read  a  list  of  nouns  and  report  the 
proportion  of  nouns  belonging  to  a  particular  semantic  category 
(e.g.,  dangerous  animals).  Two  lists  were  presented:  one  with  a 
small  proportion  of  nouns  belonging  to  the  target  category  (i.e., 
1/40),  and  a  second  with  a  larger  proportion  of  nouns  belonging  to 
the  target  category  (i.e.,  20/40).  This  task  required  both  semantic 
processing  and  association.  Petersen  and  colleagues  (1989)  found
increased  activity  in  the  cingulate  gyrus  (particularly  its  anterior 
portion)  for  tasks  requiring  a  high  level  of  attention  and  response 
selection.  The  activated  region  of  the  cingulate  gyrus  corresponded 
to  an  area  identified  through  lesion  studies  as  being  involved  in
spontaneous  speech.  The  cingulate  gyrus  may  thus  be  involved  in 
spontaneous,  cortically  induced  reflex  movements  (Carpenter  & 
Sutin,  1983;  Paxinos,  1990;  Lem,  this  volume). 

Although  the  limbic  system  is  diffuse,  it  brings  together 
principal  pathways  connecting  thalamic  nuclei  with  several  other 
structures  (Carpenter  &  Sutin,  1983),  including  (1)  the  BRF 
(involved  in  arousal),  (2)  the  hippocampus  (involved  in  memory), 
and  (3)  the  amygdala  (involved  in  affective  evaluation  of  stimuli — 
Schumann,  1990,  1991)  (Figure  1).  Thus,  through  both  direct  and 
indirect  connections,  the  NRT  is  in  a  position  to  modify  critical 
linguistic  processes  carried  out  by  the  above-mentioned  structures. 
Knowledge  of  the  NRT's  location  and  connections  establishes  its 
participation  in  information  flow  and  suggests  a  possible  role  in 
learning.  An  examination  of  the  structure  of  the  NRT  reveals  its 
function,  specifically  how  it  participates  in  selective  attention. 

Structure 

The  NRT  modulates  communication  to  and  from  the  cerebral 
cortex  by  synapsing  (i.e.,  forming  "communicative  junctions")  with 
thalamocortical  and  corticothalamic  axons.  Because  the  dendrites 
(i.e.,  the  receptive  branches)  of  NRT  neurons  run  parallel  to  the 
boundaries  of  the  NRT,  all  thalamocortical  and  corticothalamic 
axons  must  come  into  contact  with  these  dendrites  (Figure  3) 
(Scheibel  &  Scheibel,  1966;  Jones,  1985).  As  these  fibers  pass 
through  the  NRT,  they  continually  share  information  with  the 
dendrites  of  NRT  neurons.  Constant  communication  between  the 
thalamus  and  the  cortex  guarantees  that  NRT  neurons  receive 
continuously  updated  information. 

The  NRT's  ability  to  monitor  information  flow  is  enhanced 
by  filament  clusters  located  at  the  end  of  each  dendrite  (Figure  3) 
(Scheibel  &  Scheibel,  1966;  Jones,  1985).  Just  as  leaves  increase 
the  surface  area  of  a  tree,  and  therefore  increase  its  ability  to  absorb 
sunlight,  filament  clusters  increase  the  dendrite's  ability  to  absorb 
information  (Jacobs  &  Schumann,  1992).

Figure  3.  Representative  illustration  of  the  nucleus  reticularis  thalami's  (NRT)
extensive  network  of  dendrites  (d)  as  they  run  parallel  to  the  NRT's
boundaries.  These  dendrites  end  in  filament  clusters  (d^),  which
enhance  the  NRT's  capacity  to  receive  impulses  from  collaterals  (tc^,
ct^)  of  traversing  thalamocortical  (tc)  and  corticothalamic  (ct)  fibers.
Nucleus  reticularis  thalami  (NRT)  neurons  (n)  also  emit  axons  (a),
which  contribute  collaterals  (a^)  to  the  NRT  proper  before  descending
toward  underlying  thalamic  nuclei  (Th).  (Based  on  Golgi  drawings  of
cat  tissue  in  Scheibel  &  Scheibel,  1966.)

Figure  4.  Highly  schematized  drawing  illustrating  the  major  components  of  the
NRT's  feedback  circuit.  Afferent  (incoming)  information  (A)  travels 
to  thalamic  relay  cells  (R)  from  several  sources  (e.g.,  the  cerebellum 
and  sensory  receptors).  These  thalamic  relay  cells  continually 
communicate  with  the  cerebral  cortex  (C).  Thalamocortical  and 
corticothalamic  communication,  however,  is  differentially  affected  by 
thalamic  interneurons  (I)  and  NRT  neurons  (N),  which  function
independently  or  in  conjunction  with  each  other.  The  brainstem 
reticular  formation  (BRF)  contributes  to  this  system  via  projections 
to  the  thalamus  and  NRT  neurons  en  route  to  the  cortex. 
(Synthesized  from  Skinner  &  Yingling,  1977;  Scheibel,  1980; 
LaBerge,  1990.) 

The  NRT  is  not  only  a  passive  "eavesdropper."  It  can  also 
intervene  in  communication  between  the  thalamus  and  the  cortex  via 
its  axons.  After  emitting  a  few  collaterals  (=  short  extensions)  that
remain  within  the  NRT  itself,  NRT  axons  project  diffusely  to  the 
underlying  thalamus  (i.e.,  the  dorsal  thalamus)  (Steriade  et  al., 
1990).  NRT  axons  contact  two  types  of  cells  in  the  various  thalamic 
nuclei:  relay  cells  and  interneurons.  Relay  cells  are 
characteristically  excitatory,  that  is,  they  promote  information  flow. 
Relay  cells  process  a  single  sensory  modality,  project  to  specific 
regions  of  the  cerebral  cortex,  and  receive  continually  updated 
information  from  cortical  regions  to  which  they  project  (Kandel  et 
al.,  1991).  In  contrast,  interneurons  are  primarily  inhibitory.  They 
do  not  project  beyond  the  boundaries  of  the  thalamus,  but  serve  as 
an  inhibitory  interface  between  thalamic  cells. 

It  has  been  suggested  that  NRT  projections  to  relay  cells  and 
interneurons  of  other  thalamic  nuclei  establish  part  of  a  feedback 
system  which  provides  a  mechanism  for  focusing  attention  (Scheibel 
&  Scheibel,  1972;  Scheibel,  1981).  Other  projections  contributing 
to  the  NRT's  feedback  circuitry  are  the  collaterals  from  traversing 
thalamocortical  and  corticothalamic  axons,  as  well  as  projections 
ascending  through  the  brainstem  to  the  cortex.  The  primary 
components  of  the  NRT's  feedback  system  are  thus:  (1)  the 
thalamic  sensory  nuclei,  (2)  the  cortex  and  (3)  the  BRF  (Hobson  & 
Scheibel,  1980;  Jones,  1985;  LaBerge,  1990;  Steriade  et  al.,  1990). 
The  major  components  of  this  feedback  system  continually 
communicate  with  each  other  regarding  sensory  input.  The  result  of 
this  continual  communication  is  the  transformation  of  selected  input 
into  intake.  This  feedback  system,  represented  schematically  in 
Figure  4,  is  discussed  in  greater  detail  below. 


Physiology4


Enhancing  the  contrast  of  incoming  information.  The
NRT  modulates  a  topographically  organized  feedback  system.
Within  this  system,  the  NRT  (1)  monitors  continuously  updated 
communication  between  interconnected  structures  (e.g.,  thalamus 
and  cortex)  and  (2)  promotes  the  activity  of  selected  neurons  as  they 
transmit  sensory  information  to  associated  regions  of  the  cortex. 
One  of  the  main  functions  of  NRT  modulation  is  to  enhance  contrast 
among  incoming  sensory  signals,  a  phenomenon  crucial  to  CNS 
activity  because  neurons  respond  preferentially  to  contrast.  In 
general,  neurons  prefer  novel  information  and  tend  to  cease 
responding  (i.e.,  habituate)  to  repetitive  and/or  non-meaningful 
stimuli.  Habituation  helps  explain,  for  example,  how  a  person  can
forget  eyeglasses  on  his/her  forehead.  It  has  also  been  suggested 
that  because  the  CNS  becomes  accustomed  to  steady  states  of 
sensory  impulses,  pedagogical  methods  such  as  Audio-lingualism, 
which  emphasize  parrot-like  repetition  of  chained  phrases  relatively 
devoid  of  meaning,  have  inherent  neurobiological  shortcomings 
(Jacobs  &  Schumann,  1992). 

The  NRT  is  active  in  selectively  enhancing  and 
simultaneously  suppressing  (inhibiting)  information  that  flows 
through  thalamocortical  fibers.  Ascending  impulses  from  active 
thalamic  neurons  are  thus  enhanced  as  the  NRT  attenuates  the 
expression  of  surrounding  neurons,  creating  a  greater  contrast  or 
signal-to-noise  ratio  (LaBerge,  1990).  The  NRT's  inhibitory  nature 
is  evidenced  by  the  fact  that  virtually  all  NRT  neurons  contain  a 
chemical  (the  neurotransmitter  gamma-aminobutyric  acid,  or  GABA) 
that  typically  has  an  inhibitory  effect. 
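
The contrast-enhancement idea can be illustrated with a toy numerical sketch. The snippet below is not taken from any of the sources cited here. It assumes a hypothetical row of thalamic relay channels in which each channel is reduced by a fraction of the mean activity of the other channels, a simple surround-inhibition rule. The function names, the activity values, and the inhibition strength are all illustrative assumptions.

```python
def surround_inhibit(activity, strength=0.5):
    """Toy surround inhibition (hypothetical rule, not a model from the literature).

    Each channel is reduced by `strength` times the mean activity of all
    other channels; results are floored at zero. Expects at least two channels.
    """
    n = len(activity)
    total = sum(activity)
    out = []
    for a in activity:
        others_mean = (total - a) / (n - 1)
        out.append(max(0.0, a - strength * others_mean))
    return out


def contrast_ratio(activity):
    """Strongest channel divided by the mean of the remaining channels.

    Assumes the remaining channels are not all zero.
    """
    peak = max(activity)
    rest = list(activity)
    rest.remove(peak)  # drop one occurrence of the peak value
    return peak / (sum(rest) / len(rest))


before = [1.0, 1.0, 3.0, 1.0, 1.0]   # one "active" channel among neighbors
after = surround_inhibit(before)

print(before, contrast_ratio(before))
print(after, contrast_ratio(after))
```

In this sketch the most active channel loses proportionally less than its neighbors, so the ratio between it and the background rises. That is the sense in which attenuating surrounding neurons can raise a signal-to-noise ratio, although real NRT inhibition is graded, time-dependent, and topographically organized in ways this toy rule ignores.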

NRT  effects  on  behavioral  states.  Most  research  establishing 
the  inhibitory  nature  of  the  NRT  uses  the  electroencephalograph,  an 
instrument  that  records  the  electrical  activity  of  the  brain  (LaBerge, 
1990;  Steriade  et  al.,  1990).  This  instrument  correlates  cell 
functions  with  electrical  phenomena  (=  currents). 
Electrophysiologically,  the  NRT  serves  as  a  "pacemaker"  of 
thalamic  activity  (Hobson  &  Scheibel,  1980;  Steriade  et  al.,  1990). 
Thalamic  neurons  have  two  distinct  electrical  discharge  patterns, 
each  of  which  is  associated  with  a  different  behavioral  state.  The 
first  is  characterized  by  synchronous  fluctuations,  which  are 
associated  with  drowsiness  and  deep  sleep.  This  synchronous 
activity  in  the  NRT  seems  to  prevent  sensory  information  traveling 
through  thalamocortical  fibers  from  being  forwarded  to  the  cortex 
for  further  processing  (Steriade  et  al.,  1990).  The  second  type  of
current  is  characterized  by  asynchronous  bursts.  These  are 
associated  with  brain-activated  behavioral  states,  such  as 
wakefulness,  arousal  and  rapid  eye  movement  (REM)  sleep 
(Steriade  et  al.,  1990).  Asynchronous  activity  in  the  NRT  enhances 
the  transmission  of  impulses  ascending  from  the  thalamus  to  the 
cortex  (Steriade  et  al.,  1990),  a  key  aspect  of  selective  attention  and 
learning.  Because  the  NRT  projects  to  virtually  all  thalamic  nuclei, 
it  appears  to  conduct  the  rhythmicity  of  the  thalamus,  thereby 
continuously  modulating  input  to  the  cortex  (Skinner  &  Yingling, 
1977).

The  NRT  as  a  "gating"  mechanism.  Selective  attention
appears  to  result  from  the  NRT's  inhibition  of  thalamocortical 
impulses  carrying  irrelevant  information.  This  selective  inhibition 
implies  that  the  NRT  can  somehow  discriminate  between  relevant 
and  irrelevant  stimuli  in  a  given  context  (Skinner  &  Yingling, 
1977).5  As  mentioned  above,  the  NRT  is  part  of  an  inhibitory
feedback  system  which  brings  together  information  from  the  BRF, 
the  cortex,  and  the  thalamic  sensory  nuclei  (Yingling  &  Skinner, 
1977;  Scheibel,  1987)  (Figure  4).  Within  this  system,  the  NRT 
serves  as  a  "gating  mechanism."  If  NRT  "gates"  are  open, 
information  flow  to  the  cortex  is  promoted;  if  the  "gates"  are  closed, 
information  flow  to  the  cortex  is  inhibited  (Scheibel,  1987;  Steriade 
et  al.,  1990).  Both  the  BRF  and  the  cortex  (particularly  the 
prefrontal  region)  affect  the  NRT's  gating  function. 


Figure  5.  Highly  schematized  drawing  representing  the  physiology  of  the 
NRT's  inhibitory  feedback  circuit.  Brainstem  reticular  formation 
(BRF)  projections  excite  thalamic  relay  cells  (R)  and  inhibit  NRT 
neurons  (N).  The  BRF  can  suppress  the  inhibitory  effect  of  NRT 
axons  on  thalamic  relay  cells,  thereby  opening  the  NRT  "gate"  and 
promoting  thalamocortical  information  flow.  The  BRF  can  also 
suppress  the  inhibitory  effect  of  NRT  axons  on  thalamic 
interneurons  (I),  allowing  thalamic  interneurons  to  resume  their
suppressive  control  of  thalamic  relay  cells.  In  this  manner,  the 
NRT  "gate"  is  closed,  restricting  thalamocortical  information  flow. 
The  cerebral  cortex  (C)  (particularly  the  prefrontal  region)  exerts 
descending  control  over  cells  in  the  thalamus  and  the  NRT,  and  has 
either  an  excitatory  or  inhibitory  effect.  The  inhibitory  effect  of 
collaterals  from  corticothalamic  fibers  on  NRT  neurons  is  similar  to 
that  of  BRF  projections.  The  excitatory  effect  of  corticothalamic 
collaterals  increases  the  inhibitory  effect  of  NRT  axons  on  thalamic 
relay  cells,  closing  the  NRT  "gate"  and  restricting  thalamocortical 
information  flow.  The  excitatory  effect  of  corticothalamic 
collaterals  also  increases  the  inhibitory  effect  of  NRT  axons  on 
thalamic  interneurons,  suppressing  the  inhibitory  effect  of  these
interneurons,  thereby  allowing  thalamic  relay  cells  to  resume  their
communication  with  the  cortex.  The  NRT  "gate"  is  thus  open, 
promoting  thalamocortical  information  flow.  (Synthesized  from 
Scheibel  &  Scheibel,  1966;  Jones,  1975;  Skinner  &  Yingling, 
1977;  Yingling  &  Skinner,  1977;  Scheibel,  1980;  Scheibel,  1984; 
Jones,  1985;  LaBerge,  1990;  Steriade  et  al.,  1990.)

BRF  influences  on  the  gating  mechanism.  The  BRF
regulates  attention  of  a  more  general,  reflexive  nature  (Skinner  &
Yingling,  1977;  Yingling  &  Skinner,  1977;  Scheibel,  1980). 
Selective  attention  driven  by  the  BRF  helps  explain,  for  example, 
how  an  individual  directs  his/her  attention  towards  the  source  of  a 
loud  noise.  As  illustrated  in  Figure  5,  axons  ascending  to  the  cortex 
from  the  BRF  project  to  the  thalamus,  facilitating  thalamocortical 
communication.  All  thalamic  nuclei  under  the  BRF's  influence  send 
sensory  information  to  their  associated  areas  of  the  cortex  for  further 
processing. 

BRF  projections  to  the  NRT  have  an  inhibitory  effect. 
When  NRT  neurons  are  inhibited,  their  suppressive  control  of  both 
thalamic  relay  cells  and  interneurons  is  removed.  As  thalamic  relay
cells  are  released  from  NRT  suppression,  thalamocortical 
information  flow  is  promoted.  The  NRT  "gate"  is  thus  opened.  As 
thalamic  interneurons  are  released  from  NRT  suppression, 
thalamocortical  information  flow  is  suppressed  because  the 
interneurons  resume  their  inhibition  of  thalamic  relay  cells.  The 
NRT  "gate"  is  thus  closed.  The  complex  sequence  of  opened  and 
closed  gates  appears  to  be  responsible  for  the  selective  processing  of 
sensory  information  by  the  cortex  (Scheibel,  1980). 

Cortical  influences  on  the  gating  mechanism.  Information 
flow  is  regulated  not  only  from  "below"  by  the  BRF,  but  also  from 
"above"  by  the  cortex.  This  process  is  known  as  descending  control 
and  explains  how  the  cerebral  cortex  can  help  select  the  information 
it  receives  from  the  environment.  The  cortex,  in  particular  the 
projections  from  the  prefrontal  area  to  the  midline  nuclei  of  the 
thalamus  (the  intralaminar,  nonspecific  thalamic  nuclei),  tends  to 
regulate  more  discriminate,  voluntary  types  of  attention  (Skinner  & 
Yingling,  1977;  Scheibel,  1980).  Selective  attention  driven  by  the 
prefrontal  cortex  helps  explain,  for  example,  how  an  individual  is 
able  to  focus  his/her  attention  on  one  speaker  in  a  noisy  room  (i.e., 
the  cocktail  party  phenomenon)  and  perhaps  on  specific  aspects  of 
linguistic  input  (Jacobs,  1988). 

As  for  the  circuitry  itself,  collaterals  from  prefrontal 
projections  to  the  thalamus  can  have  either  a  facilitatory  or  inhibitory 
effect  on  NRT  neurons.  Inhibition  of  NRT  neurons  by  the  cortex  is 
similar  to  NRT  inhibition  by  the  BRF  (discussed  above). 
Facilitation  of  NRT  neurons  increases  the  NRT's  inhibitory  effect 
on  active  thalamic  relay  cells  and  interneurons.  The  NRT's  direct
connections  with  thalamic  relay  cells  suppress  the  two-way 
communication  between  relay  cells  and  the  cortex.  In  this  manner, 
the  NRT  "gate"  is  closed  and  thalamocortical  information  flow  is
inhibited.  However,  the  NRT  is  also  able  to  promote
thalamocortical  information  flow  through  its  connections  with
thalamic  interneurons.  In  this  manner,  the  NRT  "gate"  is  opened.
As  relay  cells  are  released  from  the  suppressive  control  of
interneurons,  thalamocortical  information  flow  is  promoted.
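
The gating logic described above for both BRF and cortical inputs can be summarized as a small truth table. The following sketch is only an all-or-none simplification of the prose, not an implementation from the cited literature. It assumes that an NRT neuron is either active or suppressed, and it treats the two routes (NRT directly onto a relay cell, and NRT onto an interneuron that in turn inhibits a relay cell) as separate cases. The names `Route` and `gate_is_open` are hypothetical.

```python
from enum import Enum


class Route(Enum):
    """The two inhibitory routes from the NRT to a thalamic relay cell."""
    DIRECT = "NRT -> relay cell"
    VIA_INTERNEURON = "NRT -> interneuron -> relay cell"


def gate_is_open(nrt_active: bool, route: Route) -> bool:
    """Return True if thalamocortical flow is promoted on this route.

    Assumption: every connection is all-or-none inhibition.
    """
    if route is Route.DIRECT:
        # An active NRT neuron suppresses the relay cell directly.
        return not nrt_active
    # An active NRT neuron suppresses the interneuron, which would
    # otherwise suppress the relay cell.
    interneuron_active = not nrt_active
    return not interneuron_active


# NRT inhibited (e.g., by BRF input) versus NRT facilitated (e.g., by cortex).
for nrt_active in (False, True):
    for route in Route:
        state = "open" if gate_is_open(nrt_active, route) else "closed"
        print(f"NRT active={nrt_active!s:5}  {route.value:34} gate {state}")
```

On this reading, inhibiting the NRT opens the gate on the direct route and closes it on the interneuron route, and facilitating the NRT does the reverse. The real circuit is graded and topographically organized, so the table is an aid to following the argument rather than a physiological claim.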

Whereas  the  BRF  influences  NRT  control  of  more  general, 
reflexive  attention,  the  cortex  plays  more  specifically  on  the  NRT  by 
influencing  NRT  control  of  discriminative,  voluntary  forms  of 
attention.6  As  such,  cortical  influences  are  particularly  important  for
the  selective  attention  necessary  to  attach  meaning  to  sensory 
signals,  including  those  of  a  linguistic  nature  (Jacobs,  1988). 


CONCLUDING  REMARKS: 

SPECULATIONS  ON  THE  NRT'S  RELEVANCE 

TO  LANGUAGE  ACQUISITION 


Although  a  great  deal  is  known  about  the  neurobiology  of 
the  NRT,  its  role  in  language  acquisition  is  largely  speculative.  The 
neurobiological  factors  presented  above  strongly  suggest  the  NRT, 
in  concert  with  other  brain  structures,  plays  a  crucial  role  in 
selectively  processing  sensory  input,  including  input  relevant  to 
language.  The  key  assumption  here  is  that  information  ascending 
through  the  NRT  to  the  cerebral  cortex  constitutes  intake  (or  at  least 
potential  intake),  which  is  the  metaphorical  equivalent  of  integrated 
and  retrievable  neural  representations.  Such  neural  representations 
of  the  external  milieu  (including  knowledge  of  language)  provide  the
building  blocks  for  continued  knowledge  (language)  acquisition. 

In  early  primary  language  acquisition,  it  is  likely  that  the 
NRT  functions  in  close  association  with  the  general  arousal 
mechanisms  of  the  young  child.  As  discussed  above,  the  NRT's 
role  in  selective  attention  constitutes  a  refinement  of  the 
phylogenetically  older  BRF's  general  arousal  and  alerting  functions,
which  are  particularly  important  for  a  developing  brain  that  is 
continually  exposed  to  and  shaped  by  a  pre-  and  post-natal 
environment  rich  in  contextually  supported  novelty  (Jacobs,  1988). 
The  novel  environment  is  essential  because  stimulus  selection  is 
often  based  on  novelty  or  relevance,  as  determined  by  the  learner's 
prior  knowledge  (i.e.,  "internal  context")  (Jacobs,  1991).  Because 
the  fetus  and  neonate  are  particularly  responsive  to  the  acoustic 
environment  (Brazelton,  1986;  Turkewitz,  1988;  Fernald,
Taeschner,  Dunn,  Papousek,  de  Boysson-Bardies  &  Fukui,  1989;
Fernald,  1990),  and  because  the  brainstem  is  functionally  active  at 
birth  (Chugani,  Phelps  &  Mazziotta,  1987),  it  seems  likely  that  the 
NRT,  under  the  influence  of  the  BRF,  can  preferentially  direct 
attention  to  those  aspects  of  the  environment  important  for  language 
acquisition  (e.g.,  rhythmicity,  intonation,  frequency  variation,  and 
the  phonetic  components  of  speech)  (Morse,  1972;  DeCasper  & 
Fifer,  1980;  DeCasper  &  Spence,  1986). 

The  prosodic  modifications  attracting  the  neonate's  attention 
during  this  crucial  period  of  brain  development  exhibit  primarily  a 
social-regulatory  function  (e.g.,  regulating  arousal  and  attention; 
expressing  communicative  intentions  of  an  affective  nature  such  as 
approval,  prohibition,  comfort)  (Stern,  Spieker  &  MacKain,  1982;
Stern,  Spieker,  Barnett  &  MacKain,  1983;  Fernald,  1991). 
Towards  the  end  of  the  first  year,  however,  prosody  serves  more  of 
a  linguistic  or  analytic  function  (e.g.,  focusing  attention  on  objects, 
enhancing  the  perceptual  salience  of  individual  words,  marking 
linguistic  units)  (Stern  et  al.,  1983).  This  may  help  the  child  learn
new  lexical  items,  parse  the  speech  stream,  and  identify  syntactic 
units  (Gleitman  &  Wanner,  1982;  Peters,  1983;  Morgan,  1986; 
Morgan,  Meier  &  Newport,  1987;  Fernald,  1991),  thereby  laying 
the  foundation  for  further  language  acquisition. 

Although  the  BRF  maintains  its  ability  to  open  the  NRT 
"gate"  for  the  lifetime  of  the  organism,  the  cerebral  cortex  gradually 
assumes  greater  influence  over  the  NRT  as  the  brain  matures. 
Cortical  descending  control  over  the  NRT  would  thus  assume 
increasing  responsibility  in  later  stages  of  first  language  acquisition 
and  in  second  language  acquisition.  This  is  especially  true  for  the 
descending  influence  of  the  prefrontal  area,  which  does  not  achieve 
functional  maturity  until  the  second  decade  of  life  (Chugani  et  al., 
1987).  In  regulating  discrete  aspects  of  attention,  the  prefrontal 
cortex  provides  an  individual  with  a  certain  degree  of  control  over 
the  sensory  (including  linguistic)  information  passing  through  the 
NRT  gate.  After  contextualized  linguistic  input  (i.e.,  external 
context)  is  directed  to  the  appropriate  cortical  area,  it  may  be 
integrated  with  extant  neural  structures  (i.e.,  internal  context).  In 
this  manner,  cortical  influences  are  particularly  important  for  the 
selective  attention  necessary  to  attach  meaning  to  sensory  signals, 
including  those  of  a  linguistic  nature  (Jacobs,  1988). 

In  conclusion,  several  neural  structures  work  in  concert  to 
manage  the  overwhelming  array  of  environmental  input  available  to 
the  language  learner  by  selecting  and  enhancing  relevant  information 
to  augment  previously  stored  knowledge.  These  brain  regions
appear  to  be  coordinated  by  the  NRT  which,  on  the  basis  of  its
structure  and  location,  is  ideally  suited  for  the  task  of  monitoring 
information  as  it  flows  to  and  from  the  cerebral  cortex.  Although 
the  claims  of  the  present  paper  derive  primarily  from  work  in 
nonhuman  animals,  the  fundamental  neural  principles  presented  here 
extend  to  humans  without  difficulty  (Jacobs  &  Schumann,  1992). 
The  NRT's  posited  role  in  language  acquisition  is  at  present  neither 
directly  observable  nor  testable.  Nevertheless,  the  present 
discussion  demonstrates  that  neurobiology  provides  plausible 
mechanisms  (rather  than  metaphors)  for  understanding  how  learners 
internalize  linguistic  patterns  available  in  input.


NOTES


1  The  brainstem  reticular  formation  (BRF)  is  one  of  the  oldest 
structures  in  the  brain  and  is  essential  for  the  organism's  survival. 
Neural  pathways  ascending  to  the  cortex  contribute  collaterals  to  the 
BRF,  providing  it  with  continually  updated  information.  Some  of 
these  collaterals  maintain  their  sensory  specific  modality  and  serve 
as  sensory  relay  systems.  Other  collaterals  lose  their  sensory 
specific  identity,  but  acquire  the  capability  to  activate  electrocortical 
currents  characteristic  of  the  brain's  attentive  or  aroused  state. 

2  Arrest  of  speaking  refers  to  the  subject's  inability  to  speak 
at  all  during  stimulation.  Anomia  refers  to  the  inability  to  name 
objects  during  stimulation.  Perseverance  refers  to  the  repetition  of 
the  object's  correct  name  or  first  letter  of  the  object's  name  during 
stimulation.  Repetition  refers  to  the  repeated  utterance  of  a  wrong 
word  during  stimulation.

3  Although  the  amygdala  is  phylogenetically  associated  with
the  basal  ganglia,  it  is  functionally  associated  with  the  limbic  system 
(Kandel  et  al.,  1991). 

4  For  the  sake  of  clarity,  this  section  oversimplifies  the 
physiological  story  considerably.  A  more  detailed  discussion  of  the 
physiology  can  be  found  elsewhere  (e.g.,  Skinner  &  Yingling,
1977;  Yingling  &  Skinner,  1977;  Scheibel,  1984;  Steriade  et
al.,  1990).

5  It  is  extremely  difficult  to  ascertain  precisely  what  input  is 
"relevant"  to  the  learner  because  relevance  is  determined  by  the 
complex  interplay  of  many  factors,  including  (1)  the  internal  context 
of  the  learner,  (2)  the  external  or  situational  context,  and  (3)  the 
period  of  development  during  which  the  individual  is  exposed  to  a 
given  input. 

6  Descending  control  can  be  clarified  by  example.  When  an 
individual  watches  television,  the  BRF's  general,  reflexive  influence 
is  apparent  when  attention  is  captured  by  a  sudden  change  of  scene, 
unexpected  action,  or  fluctuation  in  volume.  The  descending  control 
of  the  cerebral  cortex  is  manifested  when  an  individual  manages  to 
maintain  attention,  despite  drowsiness,  in  order  to  comprehend  key 
elements  of  the  program.


Research  in  L2  attrition  is  a  relatively  new  enterprise  which
is  in  need  of  a  comprehensive  theory/model.  This  paper  presents  a
tentative  cognitive-psychological  model  of  language  attrition,  which 
draws  on  information  from  studies  in  L2  attrition,  neurobiology, 
and  psychology.  This  is  to  demonstrate  that  a  model  based  on 
consideration  of  the  brain  has  the  potential  of  providing  a  plausible 
account  of  the  process  of  language  attrition,  as  well  as  the  process  of 
language  acquisition. 


INTRODUCTION 


Language  attrition  refers  to  "the  loss  of  any  language  or  any 
portion  of  language  by  an  individual  or  a  speech  community" 
(Lambert  &  Freed,  1982,  p.  1).  Four  types  of  language  attrition  are 
generally  recognized:  first  language  (L1)  loss,  second  or  foreign
language  (L2/FL)  loss1,  death  of  an  entire  language,  and  language
deterioration  in  neurologically  impaired  patients  or  the  elderly.  The 
focus  of  the  present  paper  will  be  on  L2  attrition. 

Although  L2  attrition  has  recently  gained  attention,  the 
number  of  studies  is  still  limited,  and  there  is  a  need  for  a  theory  or 
model  of  L2  attrition.  The  few  theories  or  models  now  available 
offer  only  descriptions  or  abstract  characterizations  of  linguistic 
behavior  (e.g.,  "generalization,"  "simplification,"  etc.).  These, 
however,  are  merely  useful  metaphors  which,  by  themselves,  do  not 
provide  an  account  of  why  such  behaviors  take  place  (Churchland, 
1986;  Schumann,  1990a).  A  model  of  the  mind/brain  should 
incorporate  neurobiological  reality  in  order  to  provide  a  more 
plausible  explanation  of  the  process  of  language  acquisition  and 
attrition  (Jacobs  &  Schumann,  1992).  Thus,  an  attempt  is  made
here  to  demonstrate  that  neurobiology  has  significant  bearing  on  the 
formation  of  a  model  of  language  attrition. 

The  present  paper  (1)  summarizes  key  findings  in  L2 
attrition  research2,  (2)  examines  the  neurobiological  basis  for 
language  attrition,  and  (3)  provides  a  tentative  model  of  language 
attrition.  The  summary  illustrates  what  must  be  accounted  for  in  a 
model  of  language  attrition.  Neurobiology  provides  a  means  to 
account  for  language  attrition.  The  tentative  model  is  a  result  of 
integrating  findings  from  studies  in  L2  attrition,  neurobiology,  and 
psychology. 


SECOND  LANGUAGE  ATTRITION  RESEARCH 


Summary  of  findings  in  L2  attrition  studies 

In  sum,  a  model/theory  of  language  attrition  must  be  able  to 
account  for,  or  at  least  be  compatible  with,  the  following
characteristics  observed  in  language  attrition  data: 

(1)  loss  in  reverse  order  of  acquisition;

(2)  inverse  relationship  between  loss  and  language  proficiency;

(3)  critical  threshold  of  knowledge  beyond  which  loss  is  less  likely;

(4)  residual  learning  in  the  beginning  of  the  incubation  period;

(5)  initial  plateau  before  attrition  sets  in;

(6)  permastore  content  and  the  retention  of  frequent  and/or
pragmatically  laden  expressions,  regardless  of  lack  of  input
and  use;

(7)  attrition  as  a  gradual  process  from  less  accessibility  to  complete
loss;

(8)  critical  period  of  language  attrition  around  the  age  of  8  or  9;

(9)  the  amount  of  L2  use  rather  than  length  of  exposure  as
determining  language  proficiency  and  degree/rate  of  loss;

(10)  affect  as  an  indirect  determinant  of  language  use  and  attrition.

The  following  sections  provide  some  elaborations  of  these 
characteristics.

The  Reverse  Order  Hypothesis:
Last  Learned,  First  Forgotten 

The  process  of  forgetting  a  language  is  often  believed  to  be 
the  undoing  of  the  learning  process.  The  notion  has  been  interpreted 
to  refer  to  two  related  but  different  characteristics  of  language  loss: 
the  "reverse  order  hypothesis"  and  the  "inverse  relation  hypothesis." 
The  reverse  order  hypothesis,  which  comes  from  the  notion  of
"regression"  in  aphasia  (Jakobson,  1962),  states  that  attrition  is  the
mirror  image  of  acquisition,  that  is,  the  last  thing  learned  is  the  first 
to  be  forgotten.  The  hypothesis  may  refer  to  three  different 
linguistic  levels  (De  Bot  &  Weltens,  1991):  (i)  within  skills,  that  is,
within  phonology,  morphology,  syntax,  lexicon,  etc.  (i.e.,  intra-skills
level);  (ii)  within  languages,  where  in  acquisition  perception  precedes
production  and  spoken  language  precedes  written  language,  while  in
language  loss  the  sequence  is  reversed  (i.e.,  intra-linguistic  level);
and  (iii)  between  languages,  with  respect  to  the  order  of  acquisition
and  loss  of  languages  in  multilinguals  (i.e.,  inter-linguistic  level).

The  reverse  order  hypothesis  at  the  intra-skills  level  has  been 
tested  and  supported  cross-linguistically  (Cohen,  1975;  Berman  &
Olshtain,  1983;  Olshtain,  1986;  Hansen,  1980,  cited  in  Weltens,
1987;  Moorcroft  &  Gardner,  1987;  Jordens,  De  Bot  &  Trapman, 
1989;  and  Olshtain,  1989).  These  studies  investigated  the  use  of 
specific  grammatical  structures  by  L2  learners  and  generally  found 
that  forms  learned  latest  were  lost  first.  The  generalizability  of 
reverse  order  at  the  intra-skills  level  is  limited,  however,
because  the  hypothesis  has  been  tested  only  on  a  limited  number  of 
specific  syntactic  structures. 

The  reverse  order  hypothesis  at  the  intra-linguistic  level  has 
been  tested  as  well  (Kennedy,  1932;  Geoghegan,  1950;  Scherer, 
1957;  Smythe,  Jutras,  Bramwell  &  Gardner,  1973;  Cohen,  1974; 
Bahrick,  1984a,  1984b;  Moorcroft  &  Gardner,  1987;  Weltens, 
1987;  Van  Els  &  Weltens,  1989;  Weltens  &  Cohen,  1989;  Weltens, 
De  Bot  &  Schils,  1989;  Yoshida,  1989;  Seya,  1990;  Yoshida  &
Arai,  1990).  The  majority  of  these  studies  assessed  language  skills
before  and  after  a  three-month  summer  vacation  and  analyzed  the
types  and  amount  of  skills  lost  as  a  result  of  the  interval.  Results  are 
mixed,  perhaps  because  relatively  little  loss  takes  place  during  such 
a  short  interval.  There  are  indications  of  "residual  learning"  (Cohen, 
1975;  Scherer,  1957;  Weltens  et  al.,  1989),  a  phenomenon  whereby 
incorrect  patterns  become  "unlearned"  (e.g.,  hypercorrection  of  a 
certain  form  disappears)  after  L2  use  is  discontinued.  Nevertheless, 
the  reverse  order  hypothesis  at  the  intra-linguistic  level  appears  to  be
generally  supported:  phonological  skills  are  retained  better  than
lexical  and  grammatical  skills,  receptive  (i.e.,  listening,  reading) 
skills  better  than  productive  (i.e.,  speaking,  writing)  skills, 
metalinguistic  skills  better  than  linguistic  skills,  listening  skills  better 
than  reading  skills,  and  semantics/vocabulary  better  than 
syntax/grammar.  The  reverse  order  hypothesis  at  the  inter-linguistic 
level,  compared  with  the  intra-skills  level  and  the  intra-linguistic 
level,  remains  largely  unexplored. 

The  Inverse  Relation  Hypothesis: 
Better  Learned,  Better  Retained 

The  inverse  relation  hypothesis  postulates  that  there  is  an 
inverse  relationship  between  proficiency  level  prior  to  the  onset  of 
attrition  and  the  rate  and/or  the  amount  of  loss.  In  other  words, 
what  is  learned  best  is  least  forgotten,  and  those  who  have  learned 
better,  or  become  more  proficient,  are  less  vulnerable  to  loss.  The 
hypothesis  has  been  supported  by  several  studies  (Godsall-Myers, 
1981;  Bahrick,  1984a;  1984b;  Moorcroft  &  Gardner,  1987). 

The  influence  of  L2  proficiency  on  the  order  of  loss  has  been 
observed  in  studies  which  found  that  beginning  students  lose  more 
grammar  than  vocabulary,  while  advanced  students  lose  more 
vocabulary  than  grammar  (Moorcroft  &  Gardner,  1987;  Weltens, 
1987;  Van  Els  &  Weltens,  1989;  Weltens  &  Cohen,  1989;  Weltens 
et  al.,  1989).  Moorcroft  and  Gardner  (1987)  attribute  this  finding 
to  the  degree  of  stability  of  L2  knowledge.  They  argue  that  less 
proficient  learners  have  a  relatively  unstable  knowledge  of  grammar 
and,  therefore,  are  more  likely  to  lose  recently  learned  grammatical 
structures  than  vocabulary.  In  comparison,  more  proficient  learners 
have  a  relatively  stable  knowledge  of  grammar,  which  is  learned 
first,  and  a  larger  amount  of  lexical  knowledge.  They  are,  thus,
more  apt  to  lose  vocabulary.  Observations  in  L1  loss  support  such
claims,  since  the  lexicon  is  affected  in  L1  loss  more  so  than
grammar.  Presumably,  native  speakers  have  a  complete  mastery  of
L1  grammar.

The  level  of  L2  stability  has  been  claimed  to  affect  the  degree 
of  attrition  as  well  (Olshtain,  1986;  De  Bot  &  Clyne,  1989). 
Neisser  (1984)  proposes  that  there  is  a  "critical  threshold"  of 
language  proficiency  level  beyond  which  language  skills  become 
less  vulnerable  to  attrition.  What  Pan  and  Berko-Gleason  (1986) 
call  the  "critical  mass  of  language  that,  once  acquired,  makes  loss 
unlikely"  (p.  204)  seems  to  refer  to  an  identical  notion.  The  notion 
is  in  line  with  studies  which  report  an  "initial  plateau"  or  "a  period  of
a  few  years  during  which  skills  are  relatively  unaffected  before
attrition  actually  sets  in"  (Weltens  &  Cohen,  1989,  p.  130), 
especially  in  the  case  of  high  proficiency  L2  learners  (Weltens  & 
Van  Els,  1986;  Weltens,  1989;  Snow,  Padilla,  &  Campbell,  1984 
and  Schumans,  Van  Els,  &  Weltens,  1985,  both  cited  in  Weltens  & 
Cohen,  1989).  In  other  words,  a  learner  who  has  reached  the 
critical  threshold  of  L2  proficiency  is  more  likely  to  exhibit  some 
resistance  to  attrition,  especially  in  the  early  stages  of  disuse.  It  is 
important  to  note  that  the  level  of  proficiency  is  not  necessarily  the 
consequence  of  length  of  exposure/training  (Kennedy,  1932; 
Flaughter  &  Spencer,  1967).

Not  all  of  the  studies  mentioned  above  explicitly  test  the 
reverse  order  hypothesis  or  the  inverse  relation  hypothesis,  and 
scarcely  any  study  has  tested  both  hypotheses  together. 
Nevertheless,  the  literature  taken  together  seems  to  imply  that  the 
two  hypotheses  capture  the  main  linguistic  characteristics  of 
language  attrition.  It  may  be  that  the  two  hypotheses  refer  to 
separate  processes  taking  place  in  language  attrition,  since  "what  is 
best  learned,  whether  early  or  late  in  the  acquisitional  history"  may
"be  last  lost"  (Berko-Gleason,  1980,  p.  8,  cited  in  Freed,  1982).  In
the  present  study,  however,  the  two  processes  are  viewed  as
outcomes  of  identical  biological  mechanisms  that  underlie  language 
acquisition  and  attrition. 

The  Effect  of  L2  Use  on  Attrition
and  the  Existence  of  Permastore  L2  Knowledge

Although  degree  of  attrition  is  generally  a  function  of  the 
length  of  L2  disuse,  there  are  certain  linguistic  elements  that  survive 
loss  regardless  of  lack  of  practice  (Berman  &  Olshtain,  1983;
Bahrick,  1984a;  1984b;  Moorcroft  &  Gardner,  1987;  Weltens,  1987;
Lambert,  1989;  Nakazawa,  1989a;  Yoshida,  1989;  Nakazawa  &
Yoshitomi,  1990;  Seya,  1990;  Yoshida  &  Arai,  1990;  De  Bot,
Gommans,  &  Rossing,  1991).  These  elements  include  listening
comprehension,  phonology,  and  metalinguistic  skills  in  general  as
well  as  very  frequent  and/or  pragmatically  laden  items  such  as  closed
class  vocabulary,  idioms/fixed  expressions,  and  interjections  and 
fillers  (e.g.,  um). 

Bahrick  (1984a,  1984b)  found  that  while  a  large  portion  of 
Spanish  knowledge  is  lost  within  a  few  years  after  the  termination  of 
training,  the  remainder  is  immune  to  further  losses  for  as  long  as  25 
years.  Much  of  that  content  survives  50  years  or  more.  Bahrick 
calls  this  "portion  of  knowledge  with  a  life  span  in  excess  of  25
years"  "the  permastore  content"  (1984a,  p.  111),  and  concludes  that
(1)  a  large  amount  of  information  can  survive  in  the  permastore  with 
minimum  rehearsals  during  the  interval,  (2)  the  amount  of  content  in 
permastore  is  a  function  of  the  level  of  training  (i.e.,  length  of 
training,  final  course  level  and  grade),  and  (3)  a  large  proportion  of 
semantic  knowledge  (especially  receptive  vocabulary)  is  retained  in 
permastore-content. 

Language  Attrition,  Aphasia,  Dementia, 
and  the  Speech  of  the  Elderly 

Although  a  parallel  between  language  attrition  and  aphasia 
claimed  by  Jakobson  (1962)  has  been  challenged  (Caramazza  & 
Zurif,  1978;  De  Bot  &  Weltens,  1991),  impressive  similarities 
between  language  attrition  and  dementia,  and  between  language 
attrition  and  the  speech  of  the  elderly  have  been  suggested  (Obler, 
1982;  Obler  &  Albert,  1989).  Linguistic  elements  which  are
particularly  robust  include  items  such  as  function  words  and  certain
overlearned  sequences/automatic  speech,  as  well  as  emotion-laden
items,  proverbs/idioms,  and  metalinguistic  knowledge.

Attrition  as  Complete  Loss  Versus  Decreased 
Accessibility 

Aside  from  the  retention  of  certain  permastore-content,  and 
following  the  initial  resistance  to  attrition  (i.e.,  initial  plateau, 
residual  learning)  in  the  case  of  proficient  learners,  the  attrition 
process  exhibits  a  normal  forgetting  curve,  involving  a  large  loss 
followed  by  a  more  gradual  loss  (Godsall-Myers,  1981;  Yoshida, 
1989).  Attrition  may  not  necessarily  refer  to  complete  loss  of 
skills/items  but  to  difficulty  in  retrieving  them.  Evidence  of  retrieval 
difficulty  is  observed  in  strategies  adopted  by  people  suffering 
language  loss,  such  as  "progressive  retrieval"  of  lexical  items, 
where  people  start  with  an  inappropriate  choice  of  a  word  and 
eventually  arrive  at  the  correct  one  (Sharwood-Smith,  1983;  Cohen, 
1986,  1989;  Olshtain,  1989),  and  circumlocution  as  a  means  of 
avoiding  words  which  have  become  less  accessible  (Olshtain  & 
Barzilay,  1991;  Turian  &  Altenberg,  1991).  Better  performance  on 
recognition  tasks  than  on  recall  tasks  (Bahrick,  1984a,  1984b)  also 
implies  that  items  have  not  been  completely  lost  from  memory.

Critical  Period  of  Language  Attrition

Age  and  cognitive  development  are  very  likely  to  have 
significant  effects  on  language  attrition.  The  limited  number  of 
studies  to  appear  so  far  imply  that  children  older  than  9  years  of  age 
suffer  less  language  loss,  especially  if  they  have  reached  a  certain 
stability  in  L2  knowledge  (Berman  &  Olshtain,  1983;  Olshtain,
1986,  1989;  Cohen,  1989,  1989;  Yoshida,  1989;  Yoshida  &  Arai,
1990). 

Berman  and  Olshtain  (1983),  Olshtain  (1986),  and  Olshtain
(1989)  are  longitudinal  studies  which  examined  the  attrition  of
English  as  an  L2  of  Hebrew-speaking  returnee  children,  aged  5  to
14.  The  younger  children,  aged  5  to  8  years  old,  exhibited  a 
reversal  process  of  acquisition  in  their  uses  of  irregular  noun  plural 
forms  and  verb  past  forms,  while  the  older  children  did  not 
(Olshtain,  1989).  Olshtain  (1986)  suggests  that  the  older  children's 
knowledge  of  irregular  forms  had  reached  a  level  of  stability  which 
reduced  the  possibility  of  losing  them  despite  the  lack  of  positive 
feedback.  Speaking  and  writing  skills  were  also  lost  most 
significantly  with  the  younger  subjects  (Berman  &  Olshtain,  1983).

Studies  by  Yoshida  (1989)  and  Yoshida  and  Arai  (1990),
which  investigated  the  attrition  of  English  as  an  L2  of  Japanese 
returnee  children  aged  6  to  15,  also  imply  that  age  might  influence 
the  degree  and/or  rate  of  attrition.  Although  there  was  a  tendency 
for  L2  speaking  skills  to  decline  as  a  function  of  length  of  non-use, 
children  over  8  years  of  age  generally  outperformed  the  younger 
children  in  terms  of  vocabulary  use,  utterance  number,  length,  and 
complexity,  regardless  of  the  length  of  interval.  Better  retention  of
productive  vocabulary  in  oral  language  by  a  13-year-old  is  reported 
in  Cohen  (1989)  as  well.  The  younger  child  of  age  9  exhibited 
greater  loss  both  in  the  types  and  tokens  of  the  vocabulary  produced 
during  a  storytelling  task. 

Studies  in  psycholinguistics  and  neuroscience  suggest  that 
there  are  maturational  constraints  on  language  acquisition  (Oyama,
1976,  1978;  Johnson  &  Newport,  1989;  Long,  1990).  From  an
extensive  review  of  the  literature,  Long  (1990)  concludes  that  there
seem  to  be  multiple  sensitive  periods  for  language  acquisition.  The
sensitive  period  for  acquisition  of  native-like  phonology  ends  at 
about  5  years  of  age,  while  the  sensitive  period  for  acquisition  of 
native-like  syntactic  knowledge  ends  at  about  15  years  of  age. 
Different  rates  and  degrees  of  loss  found  across  linguistic  levels 
(i.e.,  reverse  order  hypothesis  at  the  intra-linguistic  level)  may  be  a 
consequence  of  such  multiple  sensitive  periods.

Research  on  children  raised  in  the  wild,  deaf  children,  and
aphasics  also  supports  the  notion  of  maturational  constraints.  Curtiss
(1981),  in  her  study  of  the  literature  on  isolated  children,  observed
that  children  whose  approximate  age  at  discovery  was  over  8-10
years  developed  little  or  no  language  (i.e.,  syntax  and  function
word  vocabulary).  Studies  on  the  acquisition  of  American  Sign
Language  suggest  that  there  is  a  critical  period  for  acquiring  the 
morphological  system  of  sign  language,  which  may  end  as  early  as 
7  years  of  age  (Newport  &  Meier,  1985;  Newport  &  Supalla, 
1990).  Moreover,  Curtiss  (personal  communication)  suggests  that 
the  critical  period  for  language  acquisition  may  be  earlier  than 
puberty,  around  8  to  10  years  of  age.  Taken  together,  studies  cited 
in  this  section  indicate  the  possibility  of  a  common  critical  period  for 
language  (especially  productive  syntactic  skills),  generally  occurring 
by  the  end  of  the  first  decade  of  life. 

Affect  and  L2  Attrition

Gardner  and  his  colleagues  maintain  that  the  use  of  L2 
during  the  incubation  period  (i.e.,  the  period  between  the 
termination  of  language  training  and  the  time  when  retention  is 
assessed)  is  crucial  for  retention,  and  that  motivation  plays  a 
mediating  role  to  enhance  both  the  initial  language  acquisition  and 
the  use  of  L2  during  the  incubation  period  (Gardner,  1982;  1985; 
Gardner,  Lalonde,  Moorcroft  &  Evers,  1985;  Gardner,  Lalonde  & 
MacPherson,  1987;  and  Gardner  &  Lysynchuk,  1990).  Other 
studies  also  stress  the  effect  of  language  use  (Edwards,  1976;  1977) 
and  motivation  (Nakazawa,  1989)  on  the  degree  of  L2  loss.  Social- 
psychological  variables,  which  are  crucial  to  the  acculturation  model 
in  SLA  (Schumann,  1978),  are  argued  to  be  some  of  the  main 
determinants  of  language  loss  as  well  (Olshtain,  1989;  Olshtain  & 
Barzilay,  1991). 

It  is  interesting  to  note  that  items  with  great  pragmatic  load
(i.e.,  "the  extent  to  which  the  feature  normally  convey[s]
extra-linguistic  information  such  as  affect  or  important  status
relationships";  Lambert,  1989,  p.  8),  including  idiomatic
expressions  and  social  fillers,  are  less  likely  to  be  lost  once  they  are
learned  (Berman  &  Olshtain,  1983;  Lambert,  1989;  Yoshida,  1990;
De  Bot  &  Weltens,  1991).  Schatz  (1989)  proposes  that  such  items 
signal  the  L2  learner's  assimilation  to  the  target  culture.  They  may 
also  be  regarded  as  expressions  that  help  the  learners  maintain
real-life,  social-communicative  interaction.


NEUROBIOLOGICAL  SUPPORT
OF  LANGUAGE  ATTRITION  PHENOMENA 


Neural  Plasticity  as  the  Mechanism  of  Learning  and
Forgetting

The  neurobiological  basis  of  learning  and  memory  is  claimed 
to  be  a  consequence  of  neural  plasticity,  the  adaptive  capacity 
characteristic  of  biological  organisms  (Squire,  1985).  An  organism 
"can  modify  its  nervous  system"  and  "later  behave  differently  as  a 
consequence  of  these  modifications"  (p.  295). 

It  has  long  been  known  that  animals  raised  in  enriched, 
complex  environments  perform  better  on  various  behavioral  tasks 
than  those  raised  in  non-enriched,  simple  environments.  Such 
differences  in  behavioral  performance  seem  to  result  from 
neurobiological  alterations  (e.g.,  changes  in  gross  morphology, 
brain  weight,  cortical  histology,  neurophysiology,  neurochemistry, 
and  dendritic  branching)  induced  by  environmental  factors 
(Diamond,  1988;  Jacobs,  1991;  Jacobs  &  Schumann,  1992;  Jacobs
&  Scheibel,  In  Press).

Studies  on  the  effect  of  an  enriched  environment  are 
gradually  being  extended  to  humans.  Dendritic  branching  analyses, 
which  are  claimed  to  reveal  the  cortical  consequences  of  formal 
learning  (Holloway,  1966),  have  indicated  that  the  quantity  of 
dendrites  in  the  gray  substance  of  the  human  brain  and  spinal  cord 
(i.e.,  dendritic  neuropil)  is  a  possible  function  of  education  level 
and/or  predominant  lifetime  experiences  such  as  occupation 
(Scheibel,  Conrad,  Perdue,  Tomiyasu  &  Wechsler,  1990;  Jacobs, 
1991;  Jacobs,  Schall  &  Scheibel,  In  Press).  Idiosyncrasies  found  in 
cortical  topographical  representations  are  largely  dependent  on 
anatomical  changes  reflecting  individual  experience  (Merzenich, 
Recanzone,  Jenkins  &  Grajski,  1990). 

Recent  advances  indicate  that  information  is  contained  in 
connections  between  neurons  (i.e.,  nerve  cells),  and  that  use- 
dependent  anatomical  change  in  such  connections  is  a  possible 
substrate  for  enduring  increases  in  synaptic  connectivity  essential  for 
memory  storage  (Squire,  1985).  Neuronal  connectivity  involves 
dynamic  processes  such  as  cooperation,  competition,  and 
reorganization  among  neural  elements.  The  formation  and 
modification  of  connectivity  occur  throughout  life.  The  wiring  of
the  brain  is  only  roughly  aligned  prenatally,  a  period  during  which
the  targets  of  synaptic  connections  are  defined  genetically.  The
process  of  fine  tuning  continues  well  after  birth,  but  becomes
dependent  on  the  specific  interactions  between  the  organism  and  its
environment  (Kandel  &  Jessell,  1991).  Increases  in  synaptic  strength  in  the
mature  nervous  system  are  typically  accompanied  by  decreases  in 
the  strength  of  competing  connections  (e.g.,  competitive  changes  in 
axons  occurring  in  the  representation  of  the  hand  in  adult 
sensorimotor  cortex  due  to  experience-based  variables)  (Purves  & 
Lichtman,  1980;  Wiesel,  1982;  Jenkins  &  Merzenich,  1984;  Squire, 
1986).  There  are  biological  constraints,  however,  which  limit  the 
effects  of  environmental  factors.  Thus,  the  process  is  at  once
experience-based,  genetic,  and  epigenetic  (Jacobs,  1988;
Kupfermann,  1991a). 

Memory  Consolidation  and  Different  Sites  of  Memory 

Reorganization  of  connections  that  results  from  neuronal
plasticity  takes  time.  The  idea  that  changes  in  memory  storage
occur  across  time  was  proposed  by  Burnham  (1903),  and  is
commonly  referred  to  as  "memory  consolidation"  (Squire,  1985). 
Squire  (1985)  proposes  that  there  may  be  two  subtypes  of  long-term 
memory:  "intermediate  memory,"  which  is  relatively  sensitive  to 
disruption,  and  a  relatively  permanent  memory.  According  to 
Squire  (1985),  intermediate  memory  may  be  stored  in  the  part
of  the  temporal  lobe  close  to  the  midline  (i.e.,  medial  temporal  lobe),
whereas  permanent  memory  may  be  memory  transferred  to  the
cortex  on  the  external  surface  of  the  brain  (i.e.,  neocortex).

Recent  studies  in  neurobiology  indicate  that  memory  of 
stimuli  is  stored  as  changes  in  the  same  neural  systems  which 
participate  in  the  stimuli's  perception,  analysis,  and  processing 
(Squire,  1986).  In  other  words,  the  processing  site  of  certain 
information  becomes  the  memory  storage  site  of  that  information,  at 
least  temporarily.  This  claim  is  based  on  data  from  patients  with 
brain  lesions.  Lesions  resulting  in  the  loss  of  previously  acquired 
information  also  impair  the  ability  to  re-acquire  the  same 
information.  In  complex  processing,  the  respective  information  is 
assembled  further  by  other  higher-order  systems  (e.g.,  associative 
areas),  resulting  in  the  participation  of  a  large  amount  of  tissue 
without  any  redundancy  or  reduplication  of  function  across  levels. 
The  distribution  of  memory  storage  sites  will  depend  on  the  nature 
(e.g.,  complexity)  of  the  information  to  be  learned  (Squire,  1985). 

For  example,  in  visual  perception  and  memory,  information 
about  form  is  conveyed  through  different  visual  pathways  from 
those  that  convey  information  about  color.  The  cells  in  different
visual  pathways  show  different  selectivities  and  respond  only  to
certain  visual  stimulus  parameters,  such  as  shape,  form,  size,  and 
direction  of  movement  (Kandel,  1991).  Information  about  particular 
features  of  the  stimulus  converges  in  a  higher-level  visual 
processing  region  of  the  cortex  (i.e.,  the  inferior  temporal  cortex) 
that  is  thought  to  integrate  such  individual  information  into  one
representation  (Squire,  1985).  Thus,  the  storage  of  temporary 
information  may  occur  in  each  cortical  processing  system,  which  is 
later  associated  at  a  different  cortical  region  responsible  for  higher- 
level  (e.g.,  cognitive)  functions.  It  is  now  believed  that  such 
parallel  processing  and  modular  organization  are  present  in  all
sensory  cortices  (Mason  &  Kandel,  1991).  Memory,  therefore,  "is 
localized  in  the  sense  that  particular  brain  systems  represent  specific 
aspects  of  each  event,  and  it  is  distributed  in  the  sense  that  many 
neural  systems  participate  in  representing  a  whole  event"  (Squire, 
1986,  p.  1613). 

After  immediate  or  short-term  storage  is  accomplished  at 
respective  processing  sites,  changes  in  synaptic  efficacy  (i.e., 
consolidation)  take  place  to  form  long-term  memory.  The 
transformation  of  working  memory  into  long-term  memory  occurs  at
the  higher  stations  of  modality-specific  and  polymodal  sensory
systems,  and  subsequently  at  the  medial  temporal  and  diencephalic 
regions  of  the  brain  (i.e.,  areas  located  between  the  cerebral 
hemispheres  and  the  mid-brain),  which  assemble  the  temporary 
information  from  neurons  of  respective  memory  storage  sites  located 
elsewhere  in  the  brain  (Squire,  1985).  The  memory  system  in  the 
medial  temporal  lobe  consists  of  the  hippocampus  and  its  adjacent, 
anatomically  related  cortices,  which  together  play  an  essential  role  at
the  time  of  learning  in  establishing  long-term  memory,  by  binding
the  distributed  sites  of  memory  storage  and  maintaining  the
coherence  of  the  whole  representation.  Knowledge,  after 
reorganization  and  consolidation,  eventually  becomes  stored  in  the 
neocortex,  which  has  the  effect  of  freeing  the  medial  temporal  lobe 
system  for  further  learning  or  acquisition  of  new  information 
(Squire  &  Zola-Morgan,  1991). 

In  amnesia,  premorbid  memory  is  considered  vulnerable 
unless  it  has  been  consolidated  and  has  become  independent  of  the 
medial  temporal  region  (Squire,  1985).  Likewise,  language 
knowledge  that  has  not  been  integrated  through  competition  and
consolidation  into  permanent  memory  is  presumably  more  likely  to
erode  in  language  loss.

Neural  Plasticity  and  Consolidation  as  Neurobiological
Support  of  the  Reverse  Order  Hypothesis

Taken  together,  neural  plasticity  and  consolidation  of 
connections  seem  to  provide  a  plausible  explanation  for  the  linguistic 
characteristics  observed  in  language  attrition.  The  reverse  order 
hypothesis  and  the  inverse  relation  hypothesis  are  possible  outcomes 
of  the  same  biological  characteristic,  neural  plasticity,  that  governs 
changes  in  linguistic  knowledge.  Input  frequency,  an  environmental 
factor,  modifies  the  linguistic  knowledge,  which  is  presumably 
contained  in  the  dynamic  connections  of  neurons.  Linguistic  input 
is  processed  in  parallel  and  in  a  modular  manner  in  various  sensory 
cortices,  assembled  in  the  medial  temporal  lobe,  and  eventually 
stored  in  the  neocortex  as  consolidation  takes  place.  Items  which 
are  frequent  in  the  input  and  learned  during  the  earlier  stages  of 
acquisition  are  more  likely  to  have  been  consolidated  and  transferred 
to  the  neocortex.  Infrequent  and/or  later  acquired  items,  on  the  other 
hand,  are  presumably  located  in  connections  in  the  medial  temporal 
lobe,  which  are  more  vulnerable  to  loss.  In  essence, 
information/knowledge  is  more  susceptible  to  attrition  in  the  order  in 
which  it  is  stored  in  working  memory  (i.e.,  modality-specific 
processing  sites),  intermediate  memory  (i.e.,  medial  temporal 
region)  and  finally,  permanent  memory  (i.e.,  neocortex). 

Decreased  Accessibility  as  a  Consequence  of  the  Gradual 
Weakening  of  Neural  Networks 

When  attrition  occurs,  there  may  be  an  actual  loss  of  some  of 
the  neural  connections  that  originally  represented  acquired 
information.  This  view  is  supported  by  data  on  retrograde  amnesia 
(i.e.,  difficulty  in  retrieving  memories  formed  before  the  onset  of 
amnesia,  as  opposed  to  anterograde  amnesia,  which  involves 
difficulty  in  forming  new  memories).  Many  retrograde  amnesics 
have  access  to  their  remote,  premorbid  memory  but  not  to  memories 
of  things  experienced  some  weeks  or  months  before  the  injury.  In 
addition,  certain  portions  of  retrograde  amnesia  are  irreversible 
(Squire,  1985).  If  we  assume  that  forgetting  in  amnesia  and  normal 
forgetting  share  an  identical  biological  mechanism,  the  same  rule 
could  be  applied  to  language  attrition. 

Language  attrition,  however,  is  a  gradual  and  global 
process,  unlike  amnesia  or  other  abnormalities  induced  by  brain 
lesions  in  which  immediate  and  specific  deficits  of  parts  of  the 
language  system  follow.  Consequently,  people  who  are  in  the
process  of  losing  their  L1/L2  may  initially  suffer  from  a  difficulty  in
retrieving  information  represented  in  weakening  connections  before
the  information  is  completely  lost.  "Relearning"  of  a  language,
documented  in  L1  attrition  research  (Berman,  1979)  and  ESL
research  (Celce-Murcia,  1979,  cited  in  Hatch,  1983),  may  be  the
result  of  a  "re-strengthening"  of  connections  which  were  not
completely  lost  during  incubation.  "Progressive  retrieval"  of  lexical
items  observed  in  language  attrition  may  reflect  weakening
connections  as  well.

Attrition  of  Declarative  versus  Procedural  Knowledge 

It  should  be  noted  that  the  medial  temporal  region  seems 
responsible  only  for  memory  of  declarative  knowledge.  According 
to  Squire  (1986),  declarative  knowledge  "is  explicit  and  accessible 
to  conscious  awareness,"  and  contrasts  with  procedural  knowledge 
which  is  implicit,  and  "accessible  only  through  performance,  by 
engaging  in  the  skills  or  operations  in  which  the  knowledge  is 
embedded"  (1986,  p.  1614).  Examples  of  declarative  knowledge 
include  facts,  episodes,  and  lists,  which  can  be  declared. 
Procedural  knowledge,  on  the  other  hand,  includes  simple  forms  of 
associative  learning  such  as  classical  conditioning  and  habit. 
Studies  on  amnesia  have  shown  that  global  amnesic  patients  who 
have  lesions  in  the  medial  temporal  region  are  impaired  in  retrieving 
long-term  declarative  knowledge  and  also  exhibit  problems  in 
acquiring  new  knowledge.  The  amnesic  patients'  short-term 
memory  and  their  memory  for  the  very  remote  past,  presumably 
permanent  memory,  are  spared.  Thus,  the  declarative  knowledge 
lost  in  amnesia  might  be  more  specifically  considered  as  intermediate 
memory.  Such  patients  also  maintain  procedural  memory, 
suggesting  that  procedural  memory  is  a  function  of  different  regions 
of  the  brain,  presumably  the  striatum,  a  complex  of  structures  in  the
forebrain  (Squire,  1986),  and  the  cerebellum  (see  Robbins,  this
volume).

If  declarative  knowledge  refers  to  the  ability  to  explicitly 
state  linguistic  rules,  naturalistic  acquirers  of  language  would  have 
little  or  no  declarative  linguistic  knowledge.  For  them,  linguistic
knowledge,  except  for  semantic  knowledge  perhaps,  would  be 
procedural  knowledge.  Although  procedural  knowledge  could  be 
represented  declaratively  as  well,  the  two  types  of  knowledge  are 
stored  in  separate  sites  in  the  brain. 

It  is  plausible  to  think,  however,  that  procedural  knowledge 
is  acquired  through  stages  similar  to  consolidation  of  declarative 
knowledge.  Such  stages  may  involve  increasing  refinement  of  skills
(Robbins,  this  volume)  and  may  thus  show  similar  attrition  patterns  in
the  reverse  order  of  acquisition,  with  later  stages  "undone"  first.
The  mechanisms  underlying  declarative  and  procedural  acquisition
and  attrition  are  possibly  identical,  namely,  modification  of
neuronal  connections,  but  the  location  of  alteration  in  the  brain  may
be  different  for  the  two  types  of  knowledge.  At  present,  however,
neurobiological  studies  on  non-declarative  knowledge  are  too  scarce
to  assess  the  tenability  of  such  an  argument.

Neurobiological  Support  of  the  Inverse  Relation  Hypothesis:
Efficient  Processing  and  Language  Proficiency

Well-acquired  items  are  presumably  represented  in 
connections  in  the  medial  temporal  lobe  and  the  neocortex.  Hence, 
higher  proficiency  may  mean  a  larger  number  of  consolidated 
connections  in  long-term  storage.  Alternatively,  higher  proficiency 
may  refer  to  more  efficient  connections.  In  fact,  there  are  studies 
which  show  that  people  with  high  scores  on  intelligence  tests  require 
less  brain  energy  than  people  with  lower  scores.  Using  positron 
emission  tomography  (PET)  scans,  Haier  et  al.  (1988)  measured  the 
intensity  of  brain  activity  by  recording  the  amount  of  an  injected 
substance  (a  radioactively  tagged  glucose  compound)  absorbed  by 
brain  cells  while  subjects  engaged  in  cognitive  tasks.  Novel  tasks 
increased  the  amount  of  energy  consumption,  but  with  accumulated 
practice,  brain  metabolism  decreased  significantly.  Furthermore,  the 
brain  metabolism  of  proficient  task  performers  was  found  to  be  the 
least  active.  This  finding  implies  that  highly  proficient  L2  learners 
may  be  energy-efficient  as  well. 

Interestingly,  Haier  and  his  colleagues  noted  that  a  few  areas 
of  the  brain,  including  the  hippocampus,  were  activated  more  after 
practice.  This  finding  reconfirms  the  role  of  the  hippocampus  as 
one  related  to  memory  and  learning.  Furthermore,  it  suggests  that 
more  proficient  learners  are  able  to  utilize  information  carried  in  the 
neuronal  networks  in  long-term  storage,  and  hence,  are  able  to 
minimize  the  activation  of  networks  in  the  initial  processing  areas  in 
order  to  accomplish  the  task. 

The  brain  analyzes  new  stimuli  by  comparing  them  with  earlier
acquired  information  and  stores  the  new  information  in  accordance 
with  its  similarities  and  differences  to  previous  memories  of  the 
same  type.  The  analysis  consists  of  pattern  detection  and 
categorization,  which  is  an  important  feature  of  consolidation 
(Guyton,  1987;  Jacobs,  1991).  It  follows  from  this  that  the  richer
the  "internal  context"  (i.e.,  prior  knowledge)  (Jacobs,  1988)  one  has,
the  more  efficient  the  interaction  with  new  stimuli  will  be.  Greater 
efficiency  implies  the  ability  to  neglect  irrelevant  information  and 
thus,  expend  less  brain  energy. 

In  sum,  proficient  L2  learners  may  have  a  rich  connection  of 
networks  in  long-term  memory  which  enables  energy-efficient 
processing  when  acquiring  new  L2  knowledge.  The  neuronal 
connectivity  in  long-term  memory  is  better  developed  and  more 
extensive  and,  thus,  less  vulnerable  to  loss.  In  the  same  vein,  the  critical
threshold  of  L2  knowledge  may  correspond  to  a  certain  amount  and 
strength  of  storage  in  long-term  memory  which  enable  acquisition  to 
be  more  energy  conserving. 

Neurobiological  Support  of  Residual  Learning,  Initial 
Plateau,  and  Permastore 

Since  consolidation  of  memory  takes  time,  language  skills 
may  be  performed  better  after  a  certain  lapse  of  time.  For  this 
reason,  residual  learning  may  be  observed  even  when  language  use 
is  discontinued.  However,  once  new  information  has  been 
reorganized,  lack  of  input  will  result  in  the  weakening  of 
connectivity.  The  attrition  curve,  hence,  exhibits  an  "initial  plateau," 
or  resistance  to  loss,  followed  by  a  more  rapid  and  then  gradual  loss 
as  language  disuse  continues.  At  least  some  portion  of  the 
information  which  has  successfully  entered  permanent  memory  can 
be  retained  in  spite  of  minimal  use.  Thus,  the  existence  of 
"permastore"  seems  possible.  Other  information  can  still  be  retained 
if  language  is  used,  or  the  information  is  retrieved  frequently 
enough,  to  maintain  the  strength  of  the  connections.  Otherwise, 
lack  of  input  will  result  in  the  regression  of  morphological 
alterations  in  connectivity  representing  linguistic  knowledge.  Such 
regression  has  been  observed  with  animals  raised  in  an  enriched 
environment  and  then  put  in  an  impoverished  environment 
(Diamond,  1988). 
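
The shape of the attrition curve described above (an initial plateau, a more rapid loss, a gradual loss, and a residual "permastore") can be made concrete with a small sketch. The Python function below is only an illustration under stated assumptions: the reverse-sigmoid form, the name retention, and every parameter value (the permastore floor, the midpoint, the steepness, and the unit of time) are hypothetical and are not estimates drawn from attrition research.

```python
def retention(t, permastore=0.3, midpoint=5.0, steepness=3.0):
    """Hypothetical fraction of knowledge retained after t units of disuse.

    A reverse sigmoid sits on top of a fixed floor:
    - near t = 0 the curve is almost flat (the "initial plateau"),
    - around t = midpoint it falls most quickly,
    - for large t it levels off at the permastore floor.
    """
    # Share of the non-permanent ("labile") knowledge still available.
    labile = 1.0 / (1.0 + (t / midpoint) ** steepness)
    return permastore + (1.0 - permastore) * labile


if __name__ == "__main__":
    for years in (0, 1, 3, 5, 10, 25, 50):
        print(years, round(retention(years), 3))
```

Raising the permastore parameter in this sketch corresponds to assuming that more of the knowledge reached permanent memory before disuse began; it does not explain why that should be so.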

Neurobiological  Support  of  the  Critical  Period  of 
Language  Attrition 

It  has  been  observed  that  different  regions  of  the  brain  have 
different  critical  periods  of  development  (Kandel  &  Jessell,  1991). 
For  example,  the  development  of  form  perception  and  the  binocular 
vision  necessary  for  depth  perception  proceed  in  stages  after  birth,
each  stage  being  irreversible.  Appropriate  sensory  experiences  are
essential  for  normal  developmental  processes  to  occur.  Hence, 
deprivation  of  appropriate  stimulus  input  during  the  postnatal  period 
when  developmental  decisions  are  being  made  seems  to  have  the 
most  severe  consequences  on  the  maturation  of  the  nervous  system 
in  question.  Although  it  has  been  difficult  in  the  past  to  relate  the 
development  of  behavior  to  the  development  of  the  nervous  system, 
studies  on  the  maturation  of  neurons  in  the  visual  cortex  have  come 
to  provide  an  important  bridge  between  behavior  and  the  nervous 
system  (Kandel  &  Jessell,  1991).  Discrete  stages  in  the  formation 
of  other  nervous  systems  including  those  related  directly  to  various 
linguistic  skills  could  ultimately  become  clearer  as  well. 

The  dynamic  changes  of  human  dendritic  systems  triggered 
by  environmental  factors  have  been  mentioned  above.  Dendritic 
change  has  also  been  observed  to  be  age-related.  Jacobs  and
Scheibel  (in  press)  report  age-related  decreases  in  the  length  of
dendrites  emerging  from  the  base  of  the  cell  body  (i.e.,  basilar 
dendrites)  and  dendritic  laterality,  loss  of  dendritic  neuropil, 
especially  in  the  dendritic  branches  distant  from  the  cell  body  (i.e., 
distal  segments),  and  increases  in  the  number  of  basilar  dendritic 
segments.  The  study  found,  interestingly,  that  the  total  dendritic 
length  of  tissue  from  Wernicke's  area  of  nine-year-olds  was
significantly  greater  than  that  of  any  of  the  adult  subjects.  Jacobs
and  Scheibel  suggest  that  major  dendritic  development  takes  place 
through  the  first  decade  of  life,  followed  by  a  gradual  decline  of 
proliferation  that  approximates  adult  values  late  in  the  second 
decade.  Furthermore,  Chugani,  Phelps,  and  Mazziotta  (1987)  point 
out  that  cerebral  metabolism  (i.e.,  glucose  utilization)  as  well  as  the 
size  of  the  major  projection  neurons  in  the  cerebral  cortex  (i.e.,
pyramidal  cells),  which  are  indicators  of  neuronal  activity,  appear  to 
show  parallel  age-related  changes,  reaching  maximum  measure 
roughly  by  the  age  of  10.  The  correspondence  between 
neurobiological  maturation  and  the  alleged  critical  period  for 
language  attrition  is  compelling. 

Neurobiological  Substrate  of  Affect

The  role  of  affect  in  L2  acquisition  and  attrition  appears  to 
have  a  neurobiological  correlate.  The  almond-sized  mass  of  nuclei 
called  the  amygdaloid  complex  in  the  limbic  system  (i.e.,  the  border 
area  arranged  around  the  junction  of  the  cerebral  hemisphere  with 
the  brain  stem),  which  is  located  near  the  tip  of  the  temporal  lobe, 
has  been  found  to  be  the  mediator  of  the  association  of  memories 
formed  through  different  senses.  Although  the  amygdala  in  itself  is
not  part  of  the  medial  temporal  memory  system,  it  establishes  links
between  stimuli  through  its  direct  and  extensive  connections  with  all
the  sensory  systems  in  the  cortex  and  attaches  affective  evaluations 
to  the  stimuli.  Increased  attention  to  stimuli  as  a  result  of  the 
evaluation  should  lead  to  more  efficient  information  processing. 
Thus,  it  is  thought  that  the  amygdala  is  involved  in  "selective 
attention"  needed  for  learning.  It  appears  to  limit  attention  to  stimuli 
with  emotional  significance  (Squire  &  Zola-Morgan,  1991).  The 
suggestion  here  is  that  affect  influences  what  one  perceives  and
pays  attention  to  and,  thus,  determines  what  one  learns  (Schumann,
1990b;  Jacobs  &  Schumann,  1992)  and  retains.

Selective  attention  is  relevant  in  considering  why  longer 
exposure  to  L2  does  not  necessarily  imply  higher  proficiency  and 
stronger  resistance  to  loss.  Studies  indicate  that  cortical  network 
changes  are  recorded  only  when  input  is  delivered  under  attended 
behavior  (Jenkins,  Merzenich,  Ochs,  Allard  &  Guíc-Robles,  1990;
La  Berge,  1990).  Furthermore,  it  is  essential  for  successful 
language  acquisition  and  retention  that  attention  be  accompanied  by 
sufficient  input  and  use.  Based  on  their  experiment  on  rats, 
Coleman  &  Riesen  (1968)  maintain  that  certain  components  of  an 
intact  central  nervous  system  may  fail  to  develop  normally  as  a 
consequence  of  disuse  or  decreased  input,  though  there  is  a  degree 
of  dependence  on  innate  organization.  Rosenzweig,  Love,  and
Bennett  (1968)  demonstrated  that  even  a  few  hours  a  day  of  removal
from  an  impoverished  to  an  enriched  environment  produced  significant
changes  in  the  brain  chemistry  and  brain  weight  of  rats. 

These  studies  together  seem  to  support  findings  in  language 
attrition  research  which  indicate  that  affective  factors  and  selective 
attention  resulting  from  affective  evaluation  play  significant 
mediating  roles  in  determining  the  amount  of  language  use  crucial  to 
language  retention.

A  MODEL  OF  LANGUAGE  ATTRITION


Based  upon  the  integration  of  relevant  issues  discussed 
above,  the  following  schematic  representation  is  presented  as  a 
tentative  psychological  model  of  the  process  of  language  attrition.

Both  language  acquisition  and  attrition  are  consequences  of
neural  plasticity.  Neural  plasticity  allows  input  to  alter  the 
configuration  of  existing  knowledge  networks  in  memory  storage. 
New  information  is  compared  with  prior  knowledge  and  stored  in 
matched  patterns.  It  is  first  stored  in  working  memory  via  modality- 
specific  processing  systems,  then  in  intermediate  memory  where 
information  is  integrated  and  associated  with  other  information,  and 
finally  in  permanent  memory.  The  transition  of  information  to  long- 
term  storage  involves  consolidation  which  gradually  strengthens 
certain  connections  and  eliminates  or  weakens  others.  Linguistic 
knowledge  represented  in  eliminated  connections  is  that  which  is 
lost.  Since  memory  in  permanent  storage  has  gone  through 
consolidation,  the  connectivity  is  stronger  and,  thus,  less  vulnerable 
to  attrition.  Vulnerability  to  attrition  is  greatest  with  respect  to 
recently  acquired,  unconsolidated  knowledge.  Information  which 
survives  competition  and  reorganization  becomes  the  basis  for  the 
processing  of  new  information. 


Figure  1.  Neural  plasticity  allows  input  to  alter  the  configuration  of 
existing  knowledge  networks  in  memory  storage.  New 
information  is  compared  with  prior  knowledge  and  stored  in 
matched  patterns.  It  is  first  stored  in  working  memory  via 
modality-specific  processing  systems,  then  in  intermediate 
memory  where  information  is  integrated  and  associated  with 
other  information,  and  finally  in  permanent  memory.  The 
transition  of  information  to  long-term  storage  involves 
consolidation  which  gradually  strengthens  certain 
connections  and  eliminates  or  weakens  others.  Vulnerability 
to  attrition  is  greater  with  respect  to  recently  acquired, 
unconsolidated  knowledge.  Information  which  survives 
competition  and  reorganization  becomes  the  basis  for  the 
processing  of  new  information. 

Affect  restricts  attention  to  relevant  input,  and 
influences  the  amount  of  intake.  It  also  determines  which 
information  is  to  be  stored  in  long-term  memory.  Individual 
differences  in  prior  experience  which  define  the  nature  of 
personal  affective  evaluations  of  stimuli  determine  the 
formation  and  strength  of  connections  (i.e.,  speed  and 
amount  of  learning),  and  may  even  protect  certain 
information  (e.g.,  retention  of  emotionally  laden 
expressions)  during  incubation.

The  "boxology"  adopted  in  the  representation  of  the  model  is
somewhat  misleading  since  it  depicts  the  types  of  memory  storage  as 
being  separate  entities.  The  same  synaptic  connection  within  a  given 
neuronal  network  may  correspond  to  any  length  of  memory  storage, 
depending  on  the  strength  of  the  connections.  The  psychological 
terms  working,  intermediate,  and  permanent  memory  are
distinguished,  however,  in  order  to  reflect  the  neurobiological 
findings  which  indicate  presumably  separate  sites  of  storage  roughly 
corresponding  to  the  three  types  of  memory. 

Affect  restricts  attention  to  relevant  input  and  influences  the 
amount  of  intake.  It  also  determines  which  information  is  to  be 
stored  in  long-term  memory.  Individual  differences  in  prior 
experience  which  define  the  nature  of  personal  affective  evaluations 
of  stimuli  determine  the  formation  and  strength  of  connections  (i.e., 
speed  and  amount  of  learning),  and  may  even  protect  certain 
information  (e.g.,  retention  of  emotionally  laden  expressions) 
during  incubation.  Thus,  length  of  exposure  alone  does  not 
necessarily  determine  proficiency  or  achievement. 

Certain  groups  of  connections  form  a  system  which  becomes 
the  unit  for  higher-order  connections.  Different  systems  have 
different  critical  periods  of  development,  and  input  during  such 
periods  has  the  greatest  influence  on  the  nature  and  structure  of 
connections  to  be  formed.  Little  is  retained  if  input  is  received  only 
during  the  immature  stages  of  the  relevant  system  that  processes  and 
stores  that  kind  of  input. 

A  theory  of  language  attrition  should  subsume  or  be 
compatible  with  a  theory  of  language  acquisition  (Schinke-Llano, 
1989).  The  present  model,  accordingly,  assumes  that  the  biological 
mechanism  that  governs  language  acquisition  and  attrition  in  both  L1
and  L2  or,  for  that  matter,  in  any  language,  is  by  and  large  identical.
Differences  in  the  characteristics  of  the  actual  behavior  observed  in
language  acquisition  and  attrition,  or  in  L1  and  L2,  are  mostly  due  to
the  different  environmental  factors  that  have  contributed  to  the 
formation  of  idiosyncratic  neurobehavior  at  different  developmental 
stages  and  under  different  affective  conditions.  Connections  are 
input-dependent,  but  the  mechanism  that  keeps  records  of  the  input 
and  retains  them  in  long-term  memory  in  association  with  other 
input  items,  features,  or  relationships  may  be  innate. 

Although  the  present  model  is  a  psychological  one,  the 
intention  has  been  to  emphasize  the  significance  of  considering 
available  information  from  neurobiology  and  relating  that  to 
psycholinguistic  research  on  language  attrition.  With  further 
advances  in  neurobiological  studies,  the  construction  of  a
neurobiological/neuropsychological  theory  of  language  acquisition
and  attrition  should  become  feasible.

NOTES


1  The  loss  of  L2/FL  can  further  be  subcategorized  into  two  types:  L2/FL  lost
in  the  L2  environment  such  as  in  the  case  of  aging  emigrants/immigrants,  and  L2/FL
lost  in  the  L1  environment  such  as  in  the  cases  of  language  learners  at  school  or
returnees  from  countries  where  the  FL  is  spoken.

Language  attrition  studies  may  ultimately  need  to  distinguish  the  two  types 
of  language  learning.  There  is,  however,  not  enough  literature  on  language  attrition 
that  makes  the  distinction  feasible  at  this  point.  Furthermore,  my  view  on  language 
attrition  is  that  the  governing  mechanism  of  linguistic  behaviors  is  identical  whether 
the  language  is  L1,  L2,  or  FL.  I  will  argue  later  in  the  paper  that  what  makes  the
difference  are  the  cognitive,  developmental,  socio-psychological,  and  environmental 
factors  that  are  concomitant  to  the  L2/FL  distinction.  Thus,  in  the  present  paper,  I  do 
not  intend  to  make  an  explicit  distinction  between  L2  and  FL  loss. 

2  This  section  is  a  condensed  version  of  a  lengthier  and  more 
comprehensive  review  of  literature  of  L2  attrition  summarized  in  Yoshitomi  (1992). 
Some  oversimplifications  of  issues  were  inevitable  due  to  the  specific  focus  of  the 
present  paper  and  space  limitations. 

3  Although  a  more  detailed  discussion  of  the  investigation  of  language 
abilities  in  brain  damaged  persons  in  relation  to  language  loss  would  be  a  useful 
exploration,  this  paper  will  limit  itself  to  a  brief  mention  that  intriguing  similarities 
exist  between  pathological  and  non-pathological  language  loss.  Future  research
should  certainly  address  this  issue  further. 

4  For  more  discussion  on  selective  attention,  see  Sato  and  Jacobs  or  Lem,
this  volume.

Although  purely  empiricist,  or  environment-based,  theories
of  language  acquisition  suffered  some  serious  setbacks  with  the 
rise  of  generative  grammar,  they  have  very  recently  come  into 
vogue  again  with  a  new  brand  of  cognitive  modeling  known  as 
connectionism.  Connectionism  represents  the  strongest  form  of 
empiricism:  radical  connectionists  typically  argue  that  all  learning 
is  based  on  the  processing  of  input,  and  that  there  is  no  need  to 
posit  any  a  priori  internal  structure  to  the  processing  system  at  all. 
What  linguists  describe  as  rule-governed  behavior,  radical 
connectionists  say  is  only  a  description  of  the  emergent  behavior  of 
the  processor.  Learning  is  simply  a  matter  of  strengthening  and 
weakening  neural  connections  in  response  to  the  statistical 
frequency  of  patterns  in  the  input,  and  the  abstract  symbols  and 
rules  that  are  so  crucial  to  current  linguistic  theory  have  no  place  at 
all  in  a  connectionist  system.  There  have  been  varied  responses  to 
these  strong  claims.  Some  have  been  wildly  enthusiastic  about  the 
new  approach  (e.g.,  Sampson,  1987)  while  others  have  severely 
criticized  many  of  its  claims  (e.g.,  the  papers  in  Pinker  &  Mehler, 
1989).  There  is  also  an  extensive  middle  ground  between  the  two 
extremes,  however.  While  "pure"  Parallel  Distributed  Processing
(PDP)  models  seem  to  work  best  with  problems  involving  motor 
control  or  the  earliest  stages  of  visual  processing,  connectionists 
working  with  more  complex  cognitive  processes  such  as  language 
or  problem  solving  have  often  incorporated  symbols  into  their 
connectionist  architectures  (e.g.,  the  papers  in  Hinton,  1991). 
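
The radical connectionist claim that learning is only a matter of strengthening and weakening connections in response to input frequency can be illustrated with a toy sketch. The Python code below is an assumption-laden illustration, not a model from the literature cited here: the names train and respond, the learning rate, the simple error-correcting update, and the example input are all hypothetical.

```python
from collections import defaultdict


def train(pairs, rate=0.1, epochs=1):
    """Adjust cue-to-form connection weights from (cue, form) exposures.

    Each exposure strengthens the connection to the form that occurred
    and weakens the connections from the same cue to competing forms.
    No rule or symbol is stored; only numeric weights change.
    """
    weights = defaultdict(float)
    forms = {form for _, form in pairs}
    for _ in range(epochs):
        for cue, observed in pairs:
            for candidate in forms:
                target = 1.0 if candidate == observed else 0.0
                old = weights[(cue, candidate)]
                # Move the weight a small step toward its target value.
                weights[(cue, candidate)] = old + rate * (target - old)
    return weights


def respond(weights, cue):
    """Return the form most strongly connected to the cue."""
    candidates = {form: w for (c, form), w in weights.items() if c == cue}
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    # Hypothetical input: one pattern is four times as frequent as the other.
    exposures = [("past", "-ed")] * 8 + [("past", "vowel-change")] * 2
    learned = train(exposures)
    print(respond(learned, "past"))
```

In this sketch the more frequent pattern ends up with the stronger connection, so behavior that looks rule-governed falls out of the weights alone. Whether such a mechanism scales to real language data is exactly what is disputed.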

There  has  recently  been  some  interest  in  the  applicability  of
connectionist  models  to  Second  Language  Acquisition  (SLA) 
theory  and  research  (Schmidt,  1988;  Gasser,  1990;  Sokolik, 
1990).  In  the  last  issue  of  IAL,  Yas  Shirai  (1992)  added  his  voice
to  this  growing  literature  by  arguing  that  the  connectionist
framework  effectively  explains  L1  transfer  in  SLA.  He  also  argued
that  since  connectionist  models  present  a  more  neurally  plausible
model  of  the  lower  levels  (i.e.,  neural  level)  of  language
processing,  they  may  provide  second  language  researchers  with  a 
new  opportunity  to  look  inside  "the  black  box"  of  language 
behavior.  In  this  special  issue  on  neurobiology  and  language,  it 
seems  particularly  appropriate  to  take  a  closer  look  at  just  how 
neurally  plausible  connectionist  models  are,  and  whether  connectionism
adequately  explains  all  of  the  transfer  phenomena  that  Shirai  claims 
that  it  does.  While  Shirai  and  I  are  in  agreement  on  the  need  for  a 
general  cognitive  model  of  SLA  which  integrates  research  in 
several  fields  (Fantuzzi,  1989,  1990;  Hatch,  Shirai  &  Fantuzzi, 
1990),  I  disagree  that  connectionism  can  yet  explain  the  high-level 
transfer  phenomena  that  Shirai  outlines  in  his  article.

Beyond  simply  replying  to  Shirai's  claims  for  a
connectionist  explanation  of  language  transfer,  I  will  also  look 
more  closely  at  what  a  connectionist  explanation  of  a  cognitive 
function  entails.  McCloskey  (1991),  for  instance,  argues  that 
"connectionist  networks  should  not  be  viewed  as  theories  of  human 
cognitive  functions,  or  as  simulations  of  theories,  or  even  as 
demonstrations  of  specific  theoretical  points"  (p.  387).  An 
important  question,  then,  is  precisely  what  role  connectionism 
plays  in  the  development  of  cognitive  theories  and  in  the 
explanation  of  linguistic  phenomena,  a  question  which  is  of  course 
frequently  raised  in  the  connectionist  literature  itself  (e.g., 
Smolensky,  1988;  Fodor  &  Pylyshyn,  1988;  Pinker  &  Prince, 
1988;  Minsky  &  Papert,  1988;  McClelland,  1988).  This  requires 
a  closer  and  more  critical  look  at  some  existing  connectionist 
models  of  language  functions  than  Shirai  has  given. 


PRACTICING  CONNECTIONISM  WITHOUT  A 
CONNECTIONIST  MODEL 


Shirai  largely  relies  on  one  particular  implementation  of  a 
connectionist  model  to  support  his  argument  that  the  connectionist 
framework  can  provide  an  explanation  for  language  transfer.  This 
model  is  Gasser's  (1988)  localized  model  of  bilingual  sentence 
production.  However,  he  also  liberally  peppers  his  discussion 
with  references  to  other  models  (including  Gasser,  1990)  and  to  a 
general  "connectionist  framework."  This  reference  to  a  generic
framework,  however,  ignores  the  important  differences  between
the  many  different  types  of  models,  and  evades  deep  discussion  of 
their  actual  capabilities.  Gasser  (1990),  for  example,  explicitly 
points  out  that  connectionist  models  cannot  yet  model  "stages"  of
acquisition,  or  environmental  factors  or  monitoring,  and  it  is 
unclear  how  they  could.  Yet  Shirai  lists  discourse/pragmatic 
knowledge,  sociolinguistic  context,  learning  environment,  level  of 
proficiency,  markedness,  age,  attention,  and  monitoring  all  as 
conditions  on  transfer  that  the  connectionist  framework  can 
"effectively  explain." 

As  a  representative  sampling  of  Shirai's  arguments,  in  this 
section  I  will  discuss  and  critique  his  proposals  for  modeling  the 
effect  of  the  learning  environment,  level  of  proficiency,  and 
sociolinguistic  context  on  L1  transfer.  Section  2  will  focus  on  the
general  issue  of  the  role  that  connectionist  models  play  in
developing  theories  of  cognition.  In  section  3,  I  will  take  a  closer
look  at  the  claim  of  neural  plausibility,  and  discuss  how 
connectionism  has  been  used  to  model  age-related  conditions  on 
learning.  Section  4  concludes  that  Shirai  has  not  demonstrated 
how  connectionism  may  provide  SLA  researchers  with  new  and 
more  sophisticated  interpretations  of  language  transfer  or  new 
insights  into  the  role  of  Contrastive  Analysis  (CA)  in  predicting 
language  transfer.  In  my  opinion,  the  precise  role  that 
connectionist  models  of  cognition  might  play  in  SLA  research, 
beyond  Gasser's  (1990)  first  noteworthy  attempt,  has  yet  to  be 
articulated. 

Learning  Environment 

Shirai  argues  that  connectionism  explains  transfer  in  the
classroom  setting  in  that  an  acquisition-poor  learning  environment
tends  to  result  in  a  "grammar-translation"  approach,  which
"necessitates  that  the  learner  'connect'  L1  to  the  L2."  In  naturalistic
settings,  learners  who  have  little  knowledge  of  L2  and  must 
communicate  also  "have  to  make  L1-L2  connections  between 
lexical  concepts.  As  a  result  of  this  process,  L1-L2  connections 
become  stronger  and  harder  to  eliminate  later"  (p.  105).  This 
discussion  of  learning  environment  is  typical  of  how  Shirai  treats 
most  of  the  conditions  on  transfer:  there  is  a  strong  connection 
between  L1  lexicon/L1  concepts  and  L2  lexicon/L2  concepts,  and  if
L1  concepts  are  activated  while  speaking  L2,  L2  performance  will
be  influenced  by  L1.  The  connectionist  model  implements  this  via
a  highly  interconnected  network  of  spreading  activation.  The
details  of  the  implementation,  however,  are  not  given,  and  many
other  questions  remain  unanswered:  if  transfer  is  simply  the 
consequence  of  L1-L2  connections  between  lexical  concepts,  when 
does  transfer  not  occur?  Is  the  strengthening  of  connections  merely 
a  matter  of  stimulus-response?  If  these  strong  connections  are  hard 
to  eliminate,  how  is  the  L2  ever  learned?  While  it  is  an  open 
question  whether  connectionism  can  address  these  issues  or  not, 
possible  answers  could  only  be  contained  in  a  specific  model.  As 
will  be  discussed  further  below,  the  model  that  Shirai  uses  to 
illustrate  his  argument  was  not  designed  to  handle  this  particular 
problem. 

Shirai  himself  points  out  in  another  section  that  L2  learners 
in  both  classroom  and  naturalistic  settings  are  strongly  guided  by 
conscious  strategies,  but  conscious  following  of  rules  is  difficult 
for  connectionist  models  to  handle.  Since  grammar-translation,  like
much  of  L2  learning,  appears  to  involve  conscious  strategies  and
rule-application,  it  is  not  clear  just  what  aspects  of  transfer
connectionism  explains.  Again,  the  assertion  that  transfer  occurs
because  L1  is  somehow  connected  to  L2  and  activated  with
language  input  is  abstract  enough  to  be  modeled  in  many  ways.  A 
specific  model  adds  the  crucial  details  of  how  information  may  be 
represented  and  processed,  but  Shirai's  purported  explanation  for 
how  these  conditions  might  be  modeled  is  very  broad  and  vague. 

Level  of  Proficiency 

As  another  condition  on  transfer,  Shirai  notes  that  a  learner 
of  lower  proficiency  must  fall  back  on  L1  syntax.  When  there  is
little  syntactic  knowledge  of  L2  and  the  speaker  has  to  say
something,  she  will  simply  plug  L2  words  into  L1  structures,  a
process  known  as  "relexification."  Transfer  thus  occurs  because
the  learner  must  activate  her  knowledge  of  L1  in  order  to  produce
L2.  Here,  Shirai  explicitly  invokes  Gasser's  (1988)  model  to
explain  the  phenomenon.  In  Gasser's  model,  syntactic  information
in  L1  is  partly  determined  by  the  lexicon  (as  in  traditional  linguistic
theory)  and  by  a  "sequencing  component."  The  sequencing 
component  is  not  directly  involved  in  transfer,  but  is  simply 
necessary  for  this  network  to  produce  sentences  at  all.  Therefore, 
relexification  occurs  in  the  model  because  general  linguistic 
structures  exist  and  are  accessed  for  both  L1  and  L2,  and  when  the
L2  is  less  developed,  L1  structures  are  accessed.  Proposing
general  syntactic  structure  in  linguistic  representation,  however,  is 
not  unique  to  Gasser's  model  but  is,  of  course,  found  in  many 
non-connectionist  models  as  well.  Again,  we  have  not  been  shown
how  connectionism  provides  a  superior  account  of  language 
transfer. 

Sociolinguistic  Context

Although  he  gives  us  only  a  partial  description  of  a
particular  localist  model,  which  generates  a  few  simple  sentences
and  cannot  model  all  of  the  transfer  phenomena  he  describes,
Shirai  claims  that  the  model  could  easily  be  augmented  to  handle
them.  An  example  is  his  discussion  of  how  "sociolinguistic  context"
aids  transfer:  "when  the  learner  is  speaking  with  someone  from  the 
same  culture,  the  hearer-role  (represented  as  a  node)  is  specified  as 
such.  Whether  the  learner  likes  it  or  not  (i.e.,  is  goal-driven  or 
not),  the  hearer-role  is  activated,  which  leads  to  a  spreading 
activation  of  the  nodes  connected  to  it.  Thus,  the  model  would  be 
able  to  show  the  kinds  of  adjustment  which  are  called 
'accommodation'"  (p.  109).  An  example  of  accommodation  is  the 
observation  that  native  Chinese  speakers  show  more  L1  influence
when  speaking  in  Thai  with  other  native  Chinese  speakers  than 
with  a  Thai  speaker.  Shirai  is  presumably  not  suggesting  that  the  "hearer-role"
node  alone  explains  accommodation;  there  must  be  many  more
nodes  connected  to  that  one,  which  must  also  be  specified  in  the
model.

Since  all  of  the  connections  in  Gasser's  model  have  been 
set  by  hand,  to  augment  it  to  show  language  accommodation  would 
require  specifying  what  the  appropriate  "sociolinguistic  context"  for 
the  transfer  is,  all  of  the  relevant  social  or  cultural  information  that 
the  hearer-role  node  is  connected  to,  and  all  of  the  connection 
weights  as  well,  including  of  course  when  the  hearer-role's  effect 
on  transfer  would  be  overridden  since  the  hearer-role  must  always 
be  activated  in  a  bilingual.  This,  however,  is  not  a  trivial  task, 
considering  that  a  single  sentence  in  the  model  required  292  nodes 
and  1374  connections! 
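
To make concrete what such hand specification involves, the following is a minimal sketch of spreading activation in a toy localist network. It is not Gasser's implementation: the node names (concept, hearer_shares_L1, L1_form, L2_form) and every weight value are hypothetical, chosen by hand purely for illustration.

```python
# Hypothetical node names and hand-set weights; none are taken from Gasser's model.
WEIGHTS = {
    ("concept", "L1_form"): 0.5,
    ("concept", "L2_form"): 0.6,
    ("hearer_shares_L1", "L1_form"): 0.3,
}


def spread(active_nodes, weights):
    """Sum the activation each target node receives from the active sources."""
    received = {}
    for (source, target), weight in weights.items():
        if source in active_nodes:
            received[target] = received.get(target, 0.0) + weight
    return received


def select_form(active_nodes, weights=WEIGHTS):
    """Return the form node with the highest summed activation."""
    received = spread(active_nodes, weights)
    return max(received, key=received.get)


# Same concept, with and without the hypothetical hearer-role node active.
with_l1_hearer = select_form({"concept", "hearer_shares_L1"})
with_l2_hearer = select_form({"concept"})
```

In this sketch the choice of form changes with the hearer node only because the three weights were picked to make it do so. The "accommodation" is stipulated by the programmer rather than explained by the network.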

While  Gasser's  model  might  be  augmented  to  incorporate 
more  complex  information,  it  is  still  not  clear  how  much  the 
transfer  phenomena  would  be  explained  by  it,  since  the  hand-set 
connections  really  instantiate  the  programmer's  assumptions  of 
how  information  should  be  represented  in  order  to  perform  a 
certain  task.  Consider  the  example  of  code-switching  that  Shirai 
uses  to  illustrate  the  model's  general  processing  style.  Gasser  cites 
a  case  in  which  a  Japanese  speaker  inserts  an  English  word, 
"spoil",  into  an  otherwise  Japanese  sentence,  presumably  because 
the  "connections"  from  this  concept  to  linguistic  expression  are
stronger  to  English  than  to  Japanese  (ano  okanemoti  wa  ozyoosan
o  spoil  sita,  'That  rich  man  spoiled  his  daughter,'  Gasser,  1988,
p.  7).  In  order  to  model  this  behavior,  Gasser  manually  set  the 
connection  "weights"  so  that  the  English  word  'spoil'  was  stronger 
than  the  equivalent  Japanese  expression,  and  was  thus  naturally 
chosen.  Gasser's  point  in  presenting  this  example  was  not  to 
demonstrate  transfer  (which  is  after  all  only  a  general  side  effect  of 
the  processing  style),  but,  rather,  how  a  single  concept  ('spoil') 
may  map  into  more  than  one  linguistic  structure  in  the  model. 
These  linguistic  structures  were  also  specified  beforehand  by 
Gasser,  through  use  of  such  high-level  linguistic  constructs  as 
"transitive  clause,"  "direct  object,"  and  "verb". 

Although  learning  is  an  important  component  of  transfer, 
Gasser's  model  does  not  actually  learn  its  own  representations. 
However,  Shirai  speculates  on  how  the  connections  to  'spoil'  may 
have  been  acquired:  "if  a  Japanese-speaking  learner  of  English 
keeps  saying  'spoil'  instead  of  'amayakasu',  the  association  will  be 
stronger,  and  it  will  be  easier  for  him  to  say  the  word  when 
speaking  English.  This  process  would  constitute  learning  the  word
'spoil'  for  the  learner"  (p.  97).  As  in  the  earlier  examples,  though,
the  argumentation  is  circular,  and  Shirai  has  in  effect  just  restated 
the  problem  to  be  explained.  How  did  the  learner  acquire  the  word 
"spoil"  in  the  first  place?  How  are  associations  formed, 
strengthened,  or  weakened?  How  are  words  or  sentences  actually 
represented?  How  may  this  be  implemented  in  a  real  brain? 

While  Shirai  uses  Gasser's  model  to  broadly  illustrate 
connectionism,  Gasser  (1988)  himself  says  that  his  "localized" 
model  is  unlike  most  connectionist  approaches  in  that  he  started  out 
with  the  sorts  of  representations  that  one  would  find  in  a  symbolic 
model,  such  as  linguistic  "schemas",  and  nodes  for  NPs,  direct 
objects  and  accusative  case.  He  then  implemented  them  in  a 
connectionist  network  to  see  if  simple  bilingual  sentences  could  be 
generated  with  this  new  processing  style.  Certain  desirable 
properties,  such  as  automatic  generalization  and  cross-linguistic 
transfer,  come  with  this  type  of  processing  for  free.  "Pure" 
connectionist  systems,  which  use  fully  distributed  representations, 
do  not  start  with  such  high-level  constructs,  but  develop  their  own 
representations.  However,  these  networks  cannot  perform  very 
complex  linguistic  tasks,  and,  as  noted  above,  connectionists 
working  with  language  often  use  the  more  traditional  approaches  or 
theories  implemented  in  a  connectionist  processing  style — in  other 
words,  hybrid  models. 

To  sum  up  the  discussion  so  far,  I  have  argued  that  Shirai
has  presented  neither  a  clear  account  of  the  particular  language 
transfer  phenomena  to  be  explained  nor  a  model  that  would  be  able 
to  implement  such  a  theory  if  it  existed.  As  Boden  (1990)  points 
out,  any  computer  model,  whether  symbolic  or  connectionist, 
cannot  embody  a  psychological  theory  without  having  a  theory  to 
begin  with.  Connectionists  have  had  more  success  with  modeling 
low-level  aspects  of  vision  because  neural  theories  of  low-level 
visual  processing  exist.  The  very  early  stages  of  visual  processing 
are  known  to  involve  massively  parallel  neural  computation,  which 
is  suitably  implemented  within  connectionist  networks.  There  is  no 
connectionist  theory  of  language  as  yet,  although  traditional 
linguistic  theories  of  competence  and  performance  work  well  with 
more  traditional  forms  of  computing  in  Artificial  Intelligence  (AI). 
The  next  section  considers  how  connectionist  models  might  help  to 
develop  such  a  theory. 


CONNECTIONISM:    IMPLEMENTATION  OR 
EXPLANATION? 


An  ongoing  debate  in  the  AI  literature  has  been  whether  a 
computer  simulation  of  human  intelligence  may  constitute  an 
explanation  of  it  (see,  for  example,  the  exchange  between  Searle, 
1990,  and  Churchland  &  Churchland,  1990).  Some  critics  of 
conventional  AI  as  a  model  of  human  cognition  see  connectionism 
as  a  more  neurally  plausible  glimpse  into  the  "black  box"  (e.g., 
Churchland,  1986;  Dreyfus  &  Dreyfus,  1985),  and  Shirai  is  clearly 
a  proponent  of  this  position.  However,  as  we  saw  above,  Shirai 
merely  points  to  a  vague  connectionist  framework  to  support  this 
point  of  view.  In  this  section,  I  will  raise  some  general  problems 
with  viewing  connectionist  modeling  as  an  explanation  of 
cognition,  although  it  could  play  a  role  in  developing  a  theory.  My 
own  view  is  that  connectionist  and  symbolic  models  are  both  useful 
for  studying  different  aspects  of  cognitive  processing. 

This,  of  course,  runs  counter  to  Shirai's  enthusiastic 
presentation  of  connectionism  as  a  potential  "paradigm  shift"  in 
psychology  and  linguistics,  a  proposition  that  I  feel  is  not  only 
fundamentally  misguided,  but  also  unproductive  for  the  field. 
Shirai  fails  to  mention,  for  instance,  that  Clark  (1989)  considers 
such  polarization  to  be  "an  extreme  danger"  to  the  cooperative 
efforts  of  cognitive  science.  While  Shirai  notes  that  the
mechanisms  by  which  a  connectionist  system  might  handle
conscious  rule-learning  are  "unclear"  (p.  108),  he  relegates  Clark's 
own  proposal  for  a  hybrid  model  to  a  footnote  (but  see  Fantuzzi, 
1991),  and  says  that  he  argues  strongly  for  the  more  radical 
approach  "because  it  offers  a  new  perspective"  (p.  114).1
However,  even  a  radically  new  perspective  does  not  constitute  a 
paradigm  shift  and,  at  the  very  least,  connectionism  may  be 
integrated  with  traditional  symbol  systems,  if  not  viewed  as  simply 
implementing  them.  The  mere  existence  of  hybrid  models 
invalidates  the  claim  that  these  are  separate  paradigms,  and  the  vast 
majority  of  connectionists  see  themselves  as  building  upon  rather 
than  supplanting  previous  work  (see  Boden,  1990,  for  an  excellent 
review  and  a  discussion  of  this  issue).  At  any  rate,  there  is  just  no 
evidence  as  yet  that  connectionist  models  can  provide  a  better 
account  of  linguistic  phenomena  than  symbolic  models  do. 

An  argument  is  typically  made  that  parallel  distributed 
processing  is  more  brain-like  than  traditional  AI  computing  because 
serial  computations  in  the  brain  would  take  too  long,  and  highly 
interconnected  and  redundant  connectionist  processing  units  are 
more  like  real  neurons  in  that  they  display  graceful  degradation 
rather  than  complete  disruption  if  one  part  of  the  system  fails  (e.g., 
Feldman  &  Ballard,  1982;  Sokolik,  1990).  However,  these  and 
many  similar  arguments  have  been  countered  as  being  mere  details 
of  implementation,  and  not  a  principled  distinction  between 
connectionist  and  symbolic  models  of  cognitive  processes  (Fodor 
&  Pylyshyn,  1988).  Many  point  out  that  localist  connectionist 
systems  are  not  fundamentally  different  from  symbol  systems, 
since  both  store  patterns  representing  symbols  in  the  network  (e.g., 
Fodor  &  Pylyshyn,  1988;  Bechtel,  1987).  The  model  on  which 
Shirai  bases  most  of  his  discussion  is  of  course  such  a  model. 
Whether  or  not  there  are  also  explicit  rules  represented  in  the 
system  is  not  necessarily  a  central  issue  for  linguists  (Fodor  & 
Pylyshyn,  1988;  Stabler,  1983). 

The  more  radical  fully  distributed  connectionist  models  do 
offer  a  clearer  alternative  to  symbolic  AI  since  they  do  not 
distinguish  representations  (symbols)  from  the  physical  functioning 
of  the  system  itself  (Cummins  &  Schwartz,  1987).  However,  an 
important  question  is  how  well  these  systems  can  handle  complex 
phenomena  such  as  language.  While  simple  pattern  association 
may  be  one  part  of  language  learning,  it  is  hard  to  see  how  sentence 
structure  and  abstract  symbols  such  as  NP  can  be  completely 
absent  from  it,  and  the  current  state  of  connectionist  research  gives 
us  no  reason  to  discard  them.  Pinker  and  Prince  (1988),  for
example,  convincingly  argue  that  children's  double-marking  errors
(wents,  thoughted)  are  not  due  to  pattern  blending,  but  to  the 
inflection  of  the  wrong  stem,  although  the  notion  of  "stem"  or 
"affix"  is  unrepresentable  in  a  connectionist  system.  Kim,  Pinker, 
Prince  and  Prasada  (1991)  demonstrate  that  a  native  speaker's 
representation  of  past  tense  formation  includes  the  intuition  that 
denominalized  verbs  take  the  regular  past  by  default  (The  football 
player  grandstanded  to  the  crowd;  The  baseball  player  flied  out  to
left  field).  This  also  is  unrepresentable  in  a  "pure"  connectionist 
system.2 

Because  fully  distributed  connectionist  systems  cannot 
handle  structured  knowledge  very  well,  connectionists  working 
with  complex  problems  of  language  often  incorporate  symbols  into 
their  connectionist  architectures  (Hinton,  1991).  Cummins  and 
Schwartz  refer  to  this  type  of  model  as  "conservative 
connectionism,"  and  Pinker  and  Prince  (1988)  refer  to  it  as 
"revisionist-symbol-processing  connectionism,"  as  opposed  to  the 
radical  "eliminativist"  position  that  Churchland  (1986)  or  Shirai 
advocate.  It  has  even  been  suggested  that  this  dual  mode  of
processing  may  reflect  the  different  types  of  processing  that  people
seem  to  do  (Schneider  &  Shiffrin,  1977;  Schneider,  1987,  1988;
Clark,  1989).  When  symbols  are  distributed  over  many  units,  one 
gets  the  same  performance  benefits  that  one  has  with  a 
connectionist  system:  robustness,  redundancy,  resistance  to  noise 
or  damage,  automatic  generalization  and  so  on  (Fodor  &  Pylyshyn, 
1988),  including  the  sort  of  transfer  effects  that  are  seen  in 
Gasser's  (1988)  localist  model. 

One  problem  with  claiming  that  the  fully  distributed  models 
can  explain  a  certain  phenomenon  (a  problem  that  one  doesn't  have 
with  symbolic  models)  is  that  it  is  difficult  to  see  exactly  how  the 
distributed  models  arrive  at  their  solution.  Hinton  (1991)  points 
out  that  when  there  are  several  hidden  layers,  as  there  are  in  many 
of  the  learning  networks,  it  is  hard  to  say  what  each  hidden  unit 
represents.  Although  the  system  arrives  at  a  solution  to  its  task, 
even  its  designer  does  not  know  how  it  has  done  so,  and  an
"existence  proof"  that  a  model  can  perform  in  a  certain  way  is  not  a
good  "explanation."  As  Klein  (1990)  notes: 

Connectionists  make  models  tick,  but  do  not  make 
us  understand  as  yet  what  makes  them  tick.  Turning  now  to 
SLA  more  specifically,  we  do  not  just  want  a  network  which, 
when  fed  with  sufficient  input  in  the  form  of  sentences, 
provides  us  with  the  appropriate  regular  and  irregular 
morphology.  We  want  to  know  the  principles  according  to
which  the  human  mind  breaks  down  the  sound  stream  into 
smaller  parts,  assigns  structure  and  meaning  to  these,  retreats 
from  false  generalizations,  and  the  like.  (p.  226) 

Clark  (1990)  uses  the  text/phoneme  conversion  model, 
NETtalk  (Sejnowski  &  Rosenberg,  1986),  to  illustrate  this 
problem.  While  the  model  does  not  encode  traditional  phonological 
rules,  it  is  given  a  rich  prior  analysis  of  its  domain  through  the 
choice  of  input  and  output  representation,  hidden  unit  architecture 
and  learning  rule.  Although  the  network  was  95%  successful  in  its 
task  of  converting  text  to  speech  after  50,000  trials  of  "supervised" 
learning,  even  its  designers  don't  know  how  it  actually  performed 
the  task,  and  Clark  discusses  various  strategies  that  have  been  used 
to  investigate  how  it  was  done,  and  to  try  to  discover  the  principles 
by  which  the  system  arrives  at  its  solution. 

McCloskey  (1991)  makes  essentially  the  same  kinds  of 
observations  about  Seidenberg  and  McClelland's  (1989) 
connectionist  model  of  word  recognition  and  naming.  He  argues 
that  theoretical  proposals  for  cognitive  functions,  tied  to  specific 
descriptions  of  particular  networks,  are  too  vague  to  be  explicit 
theories  of  cognitive  functions,  although  they  are  valuable  tools  for 
developing  theories.  For  example,  while  the  performance  of 
Seidenberg  and  McClelland's  model  matches  the  performance  of 
human  subjects,  the  authors  cannot  specify  the  things  that  a  theory 
should  specify:  what  regularities  and  idiosyncrasies  of  the 
orthographic  and  phonological  representation  of  words  are  encoded 
by  the  network  (i.e.,  how  do  people  encode  the  different 
representations  of  the  letter  a  in  various  contexts);  how  the 
acquired  knowledge  is  actually  represented  in  the  network;  and 
how  the  propagation  of  activation  throughout  the  network  results  in
the  appropriate  representation  being  chosen  in  the  appropriate 
context  (e.g.,  the  appropriate  a  in  hat,  hate,  have).  While 
Seidenberg  and  McClelland  have  provided  an  explicit  computer 
simulation  of  a  cognitive  behavior,  McCloskey  argues  that  the 
underlying  theory  of  human  cognition  remains  vague:  just  general 
statements  to  the  effect  that  representations  are  distributed  and 
similar  words  are  represented  similarly.  The  problem  is  that  the 
dynamics  of  complex  nonlinear  connectionist  systems  are  difficult 
to  analyze,  and  thus  to  understand.  Unless  one  is  satisfied  with  an 
"existence  proof"  that  something  can  be  modeled,  the  models  still 
do  little  by  way  of  explaining  the  behavior.  They  again  provide  a 
"black  box"  simulation  of  a  cognitive  behavior  rather  than  a  theory
of  it. 

Another  important  difference  between  connectionism  and 
traditional  AI,  then,  is  the  way  that  each  constructs  theory: 

The  connectionist,  by  whatever  means,  achieves  her
high-level  understanding  of  a  cognitive  task  by  reflecting  on, 
and  tinkering  with,  a  network  which  has  learnt  to  perform 
the  task  in  question.  Unlike  the  classical  Marr-inspired 
theorist,  she  does  not  begin  with  a  well  worked  out 
(sentential,  symbolic)  competence  theory  and  then  give  it 
algorithmic  flesh.  Instead  she  begins  with  a  level-0.5  theory, 
trains  a  network,  and  then  seeks  to  grasp  the  high  level 
principles  it  has  come  to  embody.  (Clark,  1990,  p.  303, 
emphasis  his) 

Thus,  having  built  a  distributed  network  which  successfully 
completes  a  certain  task,  the  next  problem  is  to  seek  to  understand 
the  principles  that  caused  the  behavior,  the  same  task  facing  a 
neuroscientist  who  studies  a  real  neural  network.  Clark  uses  the 
term  level  0.5  theory  (as  opposed  to  level  0,  which  is  no  theory  at
all,  and  level  1,  which  would  be  a  high-level  competence  theory)  to
refer  to  the  fact  that  some  amount  of  theory  must  be  used  to  set  up 
the  program  in  the  first  place.  Differing  amounts  of  a  priori  theory 
may  be  applied  to  set  up  a  program,  but  they  all  make  some  initial 
assumptions,  even  if  it  is  only  which  features  will  be  represented 
on  which  units  and  how  many  units  or  hidden  layers  are  needed  to 
do  the  task. 

One  important  issue  may  be  the  "psychological  reality"  of 
these  assumptions.  Pinker  and  Prince  (1988)  and  Lachter  and 
Bever  (1988),  for  example,  presented  a  sharp  critique  of  the 
assumptions  underlying  Rumelhart  and  McClelland's  (1986)  past 
tense  acquisition  model.  Clark  (1990)  points  out  that  even  when 
the  initial  assumptions  are  minimal,  they  may  be  psychologically 
unrealistic;  for  example,  the  number  of  units  used  and  the  best
form  of  the  solution  must  be  specified  beforehand.  Another 
question  about  the  psychological  reality  of  connectionist  systems 
that  linguists  often  raise  is  the  assumption  that  there  is  an  explicit 
"teacher"  which  looks  at  the  output  and  incrementally  corrects  it,  a 
quite  implausible  suggestion  for  first  language  acquisition. 

Fodor  and  Pylyshyn  (1988)  argue  forcefully  that 
connectionism  may  at  best  provide  an  account  of  the  "abstract 
neural"  structures  in  which  symbolic  "cognitive  structures"  are 
implemented,  each  thus  representing  a  distinct  level  of  cognitive 
modeling.  But  then  again,  the  tasks  that  these  distributed  models
can  perform,  such  as  transcribing  text  to  speech,  adding 
phonological  past  tense  endings  to  verbs,  or  recognizing  words 
from  non-words,  are  not  the  sort  of  complex  linguistic  problems 
which  usually  occupy  linguists.  Furthermore,  the  behavior  of  the 
models  doesn't  always  match  human  behavior  (for  discussion  see 
Pinker  &  Prince,  1988).  Clearly,  what  connectionist  models  may 
someday  teach  us  about  how  humans  process  language  is  still  very 
much  an  open  question. 

McCloskey  (1991)  suggests  that  connectionist  models  may 
be  viewed  as  "animal  models"  of  human  functions.  He  argues  that 
an  animal  model  is  not  itself  a  theory  or  a  simulation  of  a  theory, 
but  an  object  of  study  which  may  lead  to  theories  of  human 
systems.  In  the  same  way,  artificial  neural  networks  may  also  be 
easier  to  study  and  analyze  than  actual  brains,  and  thus  may  one 
day  help  to  develop  a  theory  of  human  processing,  although  there 
is  no  connectionist  theory  of  cognition  as  of  yet.  As  Clark  says, 
explanation  in  connectionism  requires,  at  the  minimum,  "reflecting 
on,  and  tinkering  with,  a  network  which  has  learnt  to  perform  the 
task  in  question"  and  then  seeking  the  principles  it  has  come  to 
embody.  Arm-chair  speculating  on  the  future  capability  of  models, 
as  Shirai  does,  certainly  will  not  explain  issues  in  SLA.  A  clearer 
discussion  of  theory,  explanation  and  of  the  underlying 
assumptions  and  actual  capabilities  of  existing  models  must  be 
present  in  any  discussion  of  the  applicability  of  these  models  to 
SLA  research. 

This  section  has  pointed  out  some  problems  with  viewing 
any  connectionist  model  as  an  "explanation"  of  linguistic 
phenomena  and  takes  issue  with  Shirai's  presentation  of 
connectionism  as  a  potential  paradigm  shift  in  cognitive  modeling. 
The  next  section  considers  how  neurobiologically  plausible  the 
models  are. 


CONNECTIONIST  MODELS  AND  NEURAL 
PLAUSIBILITY 


As  Rumelhart  and  McClelland  (1986,  Part  V)  point  out, 
connectionist  models  are  considered  to  be  neurally  plausible  to 
varying  degrees.  While  certain  models  of  psychological  processes, 
such  as  Gasser's  (1988)  sentence  production  model,  are  "neurally
inspired,"  one  could  say  that  this  inspiration  is  minimal
(Schumann,  1990a).  Some  connectionist  models  are  much  more
biologically-oriented,  such  as  Munro's  (1986)  model  of  the 
development  of  ocular  dominance  in  the  visual  cortex,  and  this 
model  will  be  discussed  briefly  below. 

The  Neural  Plausibility  of  Connectionist  Algorithms 

As  noted  above,  some  connectionist  learning  algorithms 
have  been  criticized  as  being  psychologically  implausible,  since 
they  rely  heavily  on  constant  feedback  from  an  external  "teacher" 
who  knows  what  the  correct  answer  should  be.  One  widely-used 
learning  algorithm,  known  as  back-prop  (back-propagation),  is
criticized  as  being  neurally  implausible  as  well,  because  real
neurons  do  not  pass  error  information  backward  along  their
connections  so  that  the  connection  strengths  can  be  readjusted
(Thorpe  &  Imbert,  1989).

Shirai  describes  memory  and  learning  in  very  general  terms: 
as  the  strength  of  connections  between  "nodes"  in  a  network  and 
the  "activation"  and  "firing"  of  patterns  of  nodes.  Transfer  is  the 
selection  of  a  pattern  of  nodes  that  are  more  strongly  associated 
with  an  input  representation.  He  describes  learning  in  the 
connectionist  model  in  this  way: 

Essentially,  the  more  often  a  particular  node  at  the 
ends  of  connections  are  activated  and/or  fired,  the  stronger  the 
connections  become;  consequently  stronger  connections 
become  more  easily  activated,  and  this  greater  ease  of 
activation  causes  more  learning.  (p.  96)

This  corresponds  to  the  very  simplest  learning  algorithm, 
known  as  the  Hebb  rule,  which  is  indeed  an  abstraction  of  actual 
neuronal  processing  (Hebb,  1949).  McClelland,  Rumelhart,  and 
Hinton  (1986)  describe  Hebbian  learning  in  this  way:  "When  unit 
A  and  unit  B  are  simultaneously  excited,  increase  the  strength  of 
the  connection  between  them"  (p.  36). 
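
The rule as quoted can be written down in a few lines. The sketch below is only an illustration of that one-sentence statement: the function name, the learning-rate parameter, and the activation values are assumptions introduced here, not part of any published model.

```python
def hebb_update(weight, act_a, act_b, rate=0.1):
    """Strengthen the A-B connection in proportion to the units' joint activity."""
    return weight + rate * act_a * act_b


weight = 0.0
for _ in range(5):                           # A and B are co-active five times
    weight = hebb_update(weight, 1.0, 1.0)

unchanged = hebb_update(weight, 1.0, 0.0)    # B silent: the weight does not move
```

Nothing in the update refers to a target or an error. The connection simply grows whenever both units are active together, which is why the rule amounts to simple association.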

This  rule  may  be  adjusted  to  cover  both  positive  and 
negative  activation  values  (excitation  and  inhibition).  However, 
McClelland  and  colleagues  go  on  to  say  that  because  the  Hebb  rule 
has  some  limitations,  most  connectionists  do  not  generally  use  it  in 
this  form  for  more  complex  computations  but  have  devised  more 
sophisticated  learning  algorithms,  such  as  the  "delta  rule"  (which
Sokolik,  1990,  discussed  below,  uses)  and  "back-prop"  (which
Gasser,  1990,  uses).
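
For comparison with the Hebb rule, here is a minimal sketch of the delta rule in its standard textbook (Widrow-Hoff) form for a single linear output unit. It is not the network used in any of the studies cited here, and the training pattern, target, and learning rate are hypothetical.

```python
def delta_rule_step(weights, inputs, target, rate=0.1):
    """One delta-rule update for a single linear output unit."""
    output = sum(w * x for w, x in zip(weights, inputs))
    error = target - output          # the target is supplied by the external "teacher"
    return [w + rate * error * x for w, x in zip(weights, inputs)]


weights = [0.0, 0.0]
pattern, target = [1.0, 1.0], 1.0    # hypothetical training pair
for _ in range(50):
    weights = delta_rule_step(weights, pattern, target)

output = sum(w * x for w, x in zip(weights, pattern))
```

Unlike the Hebbian update, each step here depends on an error term, so the correct answer must be available on every trial. This is the "teacher" assumption that was questioned above on psychological grounds.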

While  there  does  seem  to  be  neurobiological  support  for
Hebbian  learning,  it  is  unknown  how  much  this  very  simple  type 
of  associative  learning,  which  is  observed  in  simple  slugs 
responding  to  electric  shocks,  is  involved  in  higher  cognitive 
functions.  Nevertheless,  Shirai  describes  complex  human  learning 
in  such  simple  terms.  For  example,  just  as  slugs  "learn"  to  associate
shocks  with  light,  Shirai  talks  about  people  learning  to  associate  L1
concepts/words  with  L2  words  through  the  simple  strengthening  of 
connections.  While  we  certainly  cannot  say  that  some  human 
learning  is  not  due  to  this  type  of  conditioning,  it  may  be  a  leap  of 
faith  to  attribute  complex  language  learning  or  language  transfer  to 
simple  associations  between  simultaneous  inputs,  especially  in
light  of  the  enormous  difficulties  connectionist  systems  have  in 
representing  complex  linguistic  knowledge  and,  of  course,  the 
many  convincing  arguments  from  generative  linguistics  to  the 
contrary.  Shirai  himself  brings  up  the  point  that  Chomsky  (1957, 
1959)  effectively  defeated  the  behaviorist  paradigm,  but  offers  little 
compelling  evidence  that  a  neo-behaviorist  revolution  is  in  the 
making. 

According  to  Shirai,  once  L1  connections  are  "formed  and
solidified  as  a  system,"  subsequent  alteration  of  connections  may 
become  difficult  (p.  107).  He  relates  this  to  the  notion  of 
unitization  (Kennedy,  1988)  at  the  information-processing  level: 
once  knowledge  becomes  automatized  and  "solidified"  as  a  unit  it  is 
difficult  to  alter  later.  However,  how  connectionism  explains  age- 
related  transfer  at  either  the  so-called  neural  (connectionist)  or 
psychological  level  is  again  quite  vague.  Although  he  suggests  that 
"frequent"  or  "salient"  or  even  innate  connections  may  become  "too 
strong  to  alter  later  in  life"  (p.  103),  we  are  still  faced  with  the 
problem  of  how  connections  are  formed,  how  later  learning  occurs, 
how  a  crucial  balance  is  maintained  between  the  malleability  and 
rigidity  of  connections,  and  how  real  neurons  function.  Shirai 
refers  to  Munro's  model  of  the  development  of  the  visual  cortex  as 
a  possible  connectionist  explanation  of  "age-related"  conditions  on 
transfer,  but  offers  no  discussion  of  it. 

Munro's  Model  of  a  Critical  Period 
for  Visual   Processing 

Munro  (1986)  presents  a  mathematical  model  of  a  specific 
neural  system  whose  circuitry  is  relatively  well  known,  the  visual 
cortex.  He  argues  that  the  degree  of  plasticity  in  single  neurons 
may  reflect  sensitive  periods  in  learning,  although  sensitive  periods 
do  not  necessarily  reflect  decreasing  plasticity  in  neurons.  Munro
does  not  propose  that  existing  neural  connections  may  become 
"solidified"  or  difficult  to  modify,  but  that  uncommitted  neurons 
will  naturally  change  their  state  more  rapidly  and  easily  than  already 
committed  neurons.  Further,  he  suggests  that  this  type  of  plasticity 
may  hold  only  for  the  earliest  stages  of  cognitive  processing,  which
in  the  domain  of  language  acquisition  might  be  phoneme 
recognition,  and  that  higher  cognitive  processes  may  not  show  a 
sensitive  period  at  all.  This  is  of  course  an  empirical  question. 
Whether  Munro's  framework  might  be  applied  to  cognitive  systems 
more  generally,  and  to  issues  in  language  acquisition  in  particular, 
remains  an  open  question.  However,  the  connection  that  Shirai 
attempts  to  make  between  Munro's  model  and  issues  of  transfer  in 
SLA  is  pitched  at  much  too  general  a  level:  it  amounts  simply  to  the
claim  that  a  reduction  in  the  modifiability  of  neural  connections
might  be  one  factor  leading  to  a  sensitive  period  for  some  language
functions.
We  then  need  to  ask  which  functions,  which  neurons,  why,  and 
how. 
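
The contrast between uncommitted and committed neurons can be illustrated with a toy calculation. The sketch below is not Munro's set of equations. It assumes, purely for illustration, a unit with a sigmoid response whose weight change is scaled by the slope of that response, so that plasticity depends on the unit's current state and not on a separate age parameter.

```python
import math


def sigmoid(x):
    """Standard logistic response function."""
    return 1.0 / (1.0 + math.exp(-x))


def state_dependent_step(weight, signal=1.0, rate=1.0):
    """Size of a weight change scaled by the slope of the unit's response curve."""
    response = sigmoid(weight * signal)
    slope = response * (1.0 - response)   # largest when the weight is near zero
    return rate * slope * signal


uncommitted_change = state_dependent_step(weight=0.0)   # no strong preference yet
committed_change = state_dependent_step(weight=4.0)     # strong existing preference
```

Under these assumptions the uncommitted unit changes far more per step than the committed one, although both use the same rate. A sensitive period can thus fall out of the state of the units without any explicit age-dependent learning rate.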

Sokolik's  PDP  Model  for  a  Critical  Period  in  SLA 

Munro's  approach  to  modeling  a  critical  period  for  the 
visual  cortex  may  be  compared  with  an  explicit  PDP  model  of  age- 
dependent  conditions  on  language  acquisition  that  has  been 
proposed  in  the  SLA  literature.  Sokolik  (1990)  notes  that  a  protein 
known  as  Nerve  Growth  Factor  (NGF)  is  thought  to  be  linked  to 
rate  of  learning.  If  children  have  a  higher  amount  of  this  protein 
than  adults,  and  we  assume  that  NGF  affects  the  ability  to  learn 
languages  more  quickly,  then  we  have  a  principled  physiological 
basis  for  setting  the  "learning  rate"  in  a  connectionist  algorithm 
higher  for  children  than  for  adults.  Set  at  a  higher  value,  a  PDP 
model  will  learn  more  quickly,  which,  Sokolik  suggests,  may  offer 
an  explanation  for  why  children  acquire  second  languages  "more 
readily"  than  adult  language  learners  do. 
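
The logic of setting the learning rate by age can be sketched with a one-weight unit trained by the delta rule. This is not Sokolik's simulation: the two rate values, the 0.95 criterion, and the function name are hypothetical and serve only to show what varying this single parameter does.

```python
def trials_to_criterion(rate, criterion=0.95, max_trials=10_000):
    """Count delta-rule updates until a one-weight unit's output reaches criterion."""
    weight, trials = 0.0, 0
    while weight < criterion and trials < max_trials:
        weight += rate * (1.0 - weight)   # input and target are both fixed at 1.0
        trials += 1
    return trials


CHILD_RATE = 0.5    # hypothetical value
ADULT_RATE = 0.1    # hypothetical value

child_trials = trials_to_criterion(CHILD_RATE)
adult_trials = trials_to_criterion(ADULT_RATE)
```

The "child" setting reaches criterion in fewer trials, but only because it was given the larger rate. Both settings converge on the same end state, so in this sketch the parameter affects the speed of learning and nothing else.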

Sokolik  presents  an  example  in  which  the  learning 
parameter  for  the  acquisition  of  a  certain  feature  is  set  higher  in  the 
child  PDP  model  than  it  is  in  the  adult  PDP  model.  The  child 
model,  therefore,  attains  near  mastery  of  the  feature  more  quickly 
than  the  adult  model.  However,  there  are  at  least  three  problems 
with  this  scenario,  other  than  the  psychological  reality  of  the 
learning  algorithm  itself,  as  discussed  above.  The  first  is  that,  as
Sokolik  herself  mentions,  the  significance  of  NGF  to  learning  rate
is  speculative,  and  her  account  ignores  other  factors  that  may  be
involved  in  variable  learning  for  adults,  such  as  psychological  and
sociological  factors  (Schumann,  1990b).  A  second  problem  is  that
much  empirical  research  has  suggested  that  adult  language  learners
may  at
first  be  quicker  at  acquiring  a  second  language  than  children,  but 
that  children  overtake  them  in  the  long  run  (Long,  1988).  A  third 
problem  is  that,  even  at  the  level  of  the  single  neuron,  assuming 
that  NGF  is  a  factor  in  learning  rate  is  simply  an  ad  hoc  explanation 
of  critical  period  effects.  Munro  contrasts  his  own  position  with 
the  popular  idea  that  the  neurotransmitter  norepinephrine  might  act  as
a  global  modulator  of  neuronal  plasticity  and  thus  account  for  lower 
learning  rates.  He  points  out  that  one  may  always  simply  add  an 
explicit  learning  rate  into  one's  learning  rule  in  order  to  obtain 
certain  pragmatic  results,  or  factor  in  global  modulators  which 
affect  the  learning  rate,  but  this  is  unnecessary.  As  we  saw  above, 
his  solution  for  a  critical  period  is  simply  that  uncommitted  neurons 
form  their  connections  more  easily.  But  again,  the  translatability  of 
his  particular  model  to  language  acquisition  issues  is  not 
straightforward. 

What  the  above  discussion  makes  clear  is  that  the  focus  of 
Sokolik's  and  Munro's  PDP  models  is  not  on  how  proficiency 
changes  as  a  result  of  changes  in  the  form  of  the  mental 
representations,  but  as  a  result  of  a  change  in  a  learning  rule,  or  in 
the  weights  and  connectivity  of  the  processing  units  themselves.  A 
connectionist  system  is,  as  Bialystok  (1990)  points  out,  the 
quintessential  processing  model.  But  since  the  models  only  apply 
to  on-line  processing  (e.g.,  learning)  and  do  not  apply  over  time 
(i.e.,  to  development),  they  perform  quite  different  tasks  than 
competence  models,  which  are  concerned  in  detail  with  changes  in 
structured  mental  representation.  Therefore,  Bialystok  argues, 
PDP  models  may  be  construed  as  models  of  processes  rather  than 
of  the  mental  representations  which  are  the  focus  of  competence 
theories.  The  two  approaches,  representing  different  sides  of  the 
competence-performance  distinction,  may  eventually  co-exist  as 
complementary  explanations  for  different  problems. 

The  important  point  is  not  to  maintain  a  strict  dichotomy 
between  performance  and  competence,  of  course,  but  to  realize 
that  different  aspects  of  cognitive  modeling  may  be  reconciled  into 
one whole. This directly relates to the issue of "levels" of
explanation  mentioned  in  the  last  section.  The  descriptions  of 
cognitive  behavior  at  the  level  of  neural  processes,  of  connectionist 
networks,  or  of  competence  theories  may  be  viewed  as  different 
levels  of  abstraction  (Fodor  &  Pylyshyn,  1988).  Clearly,  we  are 
not  yet  at  the  point  where  we  can  say  that  connectionism,  itself  an 
abstraction of neural processing, will ever be able to eliminate the
higher  levels. 


CONCLUSION:    CA  REVISITED 


It  appears  that  the  general  thrust  of  Shirai's  article  is  to 
revive  a  Contrastive  Analysis  (CA)  approach  to  transfer  by 
buttressing  its  theoretical  framework  with  connectionism.  This 
topic  alone  is  a  broad  one,  and  by  focussing  on  it  Shirai  may  have 
been  able  to  cover  at  least  one  area  of  his  article  in  more  depth. 

A  radical  connectionist  approach,  which  Shirai  clearly 
favors, is incompatible with parameter setting (White, 1985;
Flynn, 1987; Hilles, 1986) or with a learner's own internal
contribution  to  learning  (e.g.,  the  natural  order  hypothesis  in  the 
acquisition of morphemes). Indeed, Shirai suggests that "natural
order" phenomena "can be explained by 'naturalness' factors such
as perceptual saliency, frequency and invariance of forms, as well
as by the 'L1' factor." As he puts it:

In  connectionist  terms,  such  a  claim  can  be 
interpreted  as  follows:  the  naturalness  factor  makes  it  easy 
for  a  particular  form  to  be  connected  to  a  particular 
meaning/function.  It  will  be  easy  to  identify  and  easy  to 
match;  there  will  be  many  opportunities  to  strengthen 
connections. This will result in the Natural Order. (p. 100)

However,  this  explanation  completely  sidesteps  the 
problem  of  what  is  meant  by  "salient"  or  "easy"  to  map,  why  some 
frequent  items  are  not  learned  first,  how  this  mapping  is  done, 
etc. — that  is,  all  of  the  issues  that  are  of  interest  to  SLA 
researchers.  While  the  simple  mapping  strategy  that  Shirai 
describes  is  intuitively  plausible,  it  is  notoriously  difficult  to 
establish  causal  relations  between  these  "naturalness"  factors  and 
language  acquisition,  as  the  wide  literature  on  morpheme 
acquisition  shows.  Also,  as  Larsen-Freeman  and  Long  (1991) 
point  out,  the  claim  that  morphological  development  shows  much 
commonality  across  unrelated  languages,  pointing  to  some 
internally  driven  organization  of  input,  has  simply  been  too  well 
documented  to  be  ignored. 

In a similar vein, Shirai argues that connectionism
explains "interlingual" mapping between L1 and L2 because "when
a new pattern is encountered which is similar to another existing
pattern  in  the  learner's  representation,  the  new  pattern  would 
activate  the  existing  pattern"  (p.  111).  How  similar  must  they  be  to 
be  activated?  What  defines  similarity?  Why  are  some  items  not 
transferred  but  learned?  What  happens  in  the  network  if  patterns  are 
not  similar? 

Gasser  (1990),  of  course,  has  applied  the  "connectionist 
framework"  to  an  actual  test  of  the  CA  hypothesis.  One  thing  that 
he  found  was  that  the  learning  performance  of  the  model  was  more 
complex  than  "traditional"  CA  would  predict,  namely  that  all 
differences between L1 and L2 should be equally difficult to learn. The
independent variables of L1/L2 that Gasser modeled (word order
and  lexical  form)  in  fact  showed  an  effect  of  interaction.  While  he 
admits  that  the  conditions  of  the  model  were  a  gross 
oversimplification  of  an  actual  language  learner's  task,  Gasser 
notes  that: 

(W)hile  these  results  should  be  regarded  very  tentatively,  they 
point  to  a  possible  line  of  connectionist  research,  one  in 
which  networks  test  out  particular  hypotheses  about  transfer 
and  suggest  what  types  of  data  are  needed  to  flesh  out  the 
transfer  picture.  The  main  conclusion  to  be  drawn  from  these 
simulations  is  that,  even  with  this  extremely  simple  model 
of  the  transfer  process,  it  was  impossible  to  predict  precisely 
how  the  network  would  behave.  Thus  simulations  have  an 
important  role  to  play.  (p.  196) 

While  connectionist  networks  may  indeed  provide  new 
ways  of  testing  our  hypotheses  about  language  processing  and 
learning,  simulations  serve  to  help  develop  and  refine  our  theories 
of  language,  not  to  eliminate  them.  The  connectionist  approach  as 
Shirai  describes  it  does  not  provide  new  and  more  sophisticated 
interpretations  of  language  transfer  or  new  insights  into  the  role  of 
Contrastive  Analysis  (CA)  in  predicting  language  transfer.  The 
role  of  CA,  and  the  connectionist  explanation  of  transfer,  are 
treated  with  the  same  brevity  and  superficiality  as  is  his  discussion 
of the connectionist framework itself. While Shirai has given us an
informative  overview  of  the  conditions  thought  to  influence  transfer 
in  SLA,  its  tie-in  to  connectionism  may  have  benefitted  from  a 
more  narrow  focus,  perhaps  a  closer  look  at  Gasser's  (1990) 
connectionist  model  of  transfer  and  a  more  detailed  discussion  of 
its  implications  for  a  CA  position  than  Gasser  himself  provides. 

As has been made abundantly clear throughout this paper, I
believe  that  Shirai's  claim  for  a  connectionist  explanation  of 
transfer  is  greatly  overstated.  Second  language  researchers  who 
are  to  start  research  projects  in  the  connectionist  framework  will 
need  to  know  more  precisely  how  models  work  and  how  they  may 
be  applied  to  particular  problems.  I  am  of  the  opinion  that 
connectionist  models  will  probably  never  replace  higher-level 
explanations  in  cognitive  modeling,  although  they  may  help  to 
develop  theories  at  the  level  of  implementation. 


NOTES


1  Indeed,  his  token  references  to  some  potential  compromises  belie  the 
obvious  prominence  given  to  the  "alternative  view"  throughout  the  paper.  He 
appears  to  treat  the  distinctions  between  various  types  of  connectionist  models  as 
some  irrelevant  detail. 

2  Although  attempts  have  been  made  to  improve  the  original  model  which 
was  the  focus  of  Pinker  &  Prince's  extensive  criticisms,  these  particular  problems 
have  not  been  addressed;  indeed,  that  would  require  adding  symbols  to  the  network. 


Bailey has written extensively on foreign language teaching
and  applied  linguistics,  as  well  as  on  linguistic  theory  and  the 
interface  between  all  these  areas  (e.g.,  Bailey,  1982,  1985,  1987). 
Why  More  English  Instruction  Won't  Mean  Better  Grammar 
(WMEIWMBG) is an important contribution to the areas of English
language and grammar teaching because it presents a systematic
analysis of a wide range of grammatical phenomena where none has
previously been apparent and applies modern linguistic concepts and
devices to grammar construction (e.g., developmental linguistics,
generative grammar, and sociolinguistics). WMEIWMBG is
addressed primarily to teachers of English (and to grammarians) and
has recently been added to the ERIC list of the U.S. Department of
Education.  A  number  of  the  issues  discussed  in  this  work  are  dealt 
with  in  greater  detail  and  from  a  more  linguistic  perspective  in 
Bailey's  (forthcoming)  book. 

WMEIWMBG is divided into four chapters followed by one
appendix.  Chapter  1  presents  a  systematic  account  of  a  wide  range 
of  English  grammatical  phenomena  which,  according  to  Bailey,  are 
neglected  by  most  grammarians.  He  does  not  mention  which 
grammars  he  has  in  mind.  The  structures  discussed  by  Bailey 
include  (not  necessarily  in  this  order):  (i)  rules  governing  the  use  of 
prepositions,  including  those  for  placing  prepositions  before  relative 
and  interrogative  pronouns;  (ii)  rules  for  the  use  of  interrogative  and 
personal  pronouns,  as  well  as  for  the  deletion  of  non-demonstrative 
that; (iii) a systematic characterization of the English verb, including
present,  past  and  timeless  tenses,  infinitives,  participles,  and 
gerunds,  as  well  as  the  pragmatic  (rather  than  grammatical)  use  of 
go  and  get,  the  elided  and  unelided  forms  of  have,  do,  and  got,  and 
here's,  there's,  where's  with  plural  predicates;  (iv)  principles  for 
distinguishing adverbs having and not having the ending -ly, and the
genitival  forms  's  and  of;  and  (v)  a  clear  and  systematic  analysis  of 
mass  nouns,  abstracts,  collectives,  and  generics,  as  well  as 
substitution  and  deletion  rules  changing  lexical  forms  in  various 
ways. 

Chapter  2  discusses  Bailey's  concept  of  grammatical  system 
with  particular  reference  to  the  English  verb.  Rather  than  the  loosely 
connected  lists  of  unrelated  tenses  found  in  many  grammars,  the 
English  verb  is  analyzed  here  as  a  structure  of  forms  derived  with 
explicit  principles  from  a  common,  but  small,  set  of  primes. 
Bailey's  grammatical  system  is  similar  to  Saussure's  (1962)  where 
"tout  se  tient" — it  all  hangs  together. 

Chapter 3 applies modern linguistic concepts and devices to
the  construction  of  grammars.  These  include:  (i)  writing  the  formal 
empty  form  e  to  replace  deleted  forms — making  grammatical 
phenomena  more  transparent  and  intelligible — as  well  as  the  use  of 
generative  linguistic  concepts  such  as  raising  and  other  types  of 
movement  (Chomsky,  1981,  1986);  (ii)  grammatical  devices 
employed  in  developmental  linguistics  such  as  markedness-reversal 
in  marked  environments,  e.g.,  reversals  in  negative  contexts  with 
auxiliaries,  modals,  passives,  etc.  (Bailey,  1984;  Faingold,  1991); 
and  (iii)  Labov's  (1978:  Chapter  8)  observer's  paradox,  showing 
that  speakers  who  deny  using  get-passives,  as  well  as  other 
structures,  do  in  fact  produce  these  forms  in  unmonitored  speech. 

Chapter  4  discusses  the  issue  of  grammatical  correctness  in 
the  English-speaking  world  and  elsewhere.  As  Bailey  shows, 
grammatical  correctness  in  English  is  determined  by  fashionable 
(typically  young)  speakers,  while  other  countries  (e.g.,  Spain, 
France, Germany) have language academies or other authorities who
determine  the  grammaticality  of  language  structures.  English 
dictionaries  are  replete  with  "errors"  because  they  record,  rather  than 
define,  what  is  acceptable;  in  contrast,  the  dictionaries  produced  by 
language  academies  or  other  authorities  are  by  definition  free  of 
errors  because  they  legally  define  language  form. 

The  appendix  defines  Bailey's  use  of  concepts  such  as 
modal  verbs,  verbids,  aspectuality,  modality,  and  marke(re)dness. 

WMEIWMBG  makes  an  invaluable  contribution  to  the 
teaching of English grammar. It applies modern linguistic models,
reduces  terminological  and  structural  chaos,  and  provides  a 
systematic  analysis  of  a  wide  range  of  phenomena.  Specifically,  (i) 
spurious  grammatical  categories  (e.g.,  adverbial  nouns,  pronominal 
demonstratives,  the  "genitive"  case,  etc.)  are  reduced  to  the 
recognized  parts  of  speech  based  on  the  more  fundamental 
grammatical categories of case, predication, and modification with
apposition;  (ii)  current  categories  such  as  subjunctives  are  seen  as 
processual  modalities,  anteriors  as  distinct  from  perfects,  and 
timeless  verbal  forms  (e.g.,  present  perfect)  as  separate  from  real 
presents  (e.g.,  present  continuous);  and  (iii)  seemingly  chaotic 
patterns  of  the  English  verb  are  systematically  handled  with 
developmental devices such as markedness-reversal, whereby
systematic  and  predictable  reversals  occur  in  marked  categories  and 
environments  (e.g.,  in  timeless,  anterior,  posterior  quasi-temporal 
categories,  and  modals  under  negation,  interrogation,  and 
comparison). 


An Introduction to Second Language Acquisition
Research by Diane Larsen-Freeman and Michael H. Long. London
and New York: Longman, 1991. xvii + 398 pp.

Reviewed  by 
Charlene  G.  Polio 
Michigan  State  University 

Over  the  past  few  years,  applied  linguistics  has  been  trying 
to  answer  the  question:  what  is  applied  linguistics?  (See 
discussions  on  this  question  in  Issues  in  Applied  Linguistics,  1990, 
1992.)  Second  language  acquisition  (SLA)  has  avoided  the 
potentially  polemic  question:  what  is  SLA?  While  there  is  little 
doubt  that  SLA  is  a  field  in  its  own  right  (see  Gass,  in  press; 
Larsen-Freeman, 1991), what constitutes mainstream SLA, or the
core  of  the  field,  may  not  be  agreed  upon.  As  the  field  grows  and 
fragments,  this  issue  needs  to  be  addressed.  Nowhere  is  the  issue 
of  defining  the  field  of  SLA  as  pertinent  as  in  the  writing  of  an 
introductory  SLA  textbook.  Ten  years  ago,  such  a  task  would  not 
have been as formidable. Today, one must first ask what should be
included and in what depth it should be covered.

The  most  recent  effort  to  introduce  newcomers  to  the  field  of 
SLA  is  Larsen-Freeman  and  Long's  Introduction  to  Second 
Language  Acquisition  Research.  In  evaluating  such  an  effort,  one 
must  consider  what  the  authors  chose  to  include  and  what  to 
exclude.  Were  any  essential  research  or  concepts  omitted  and/or 
was  any  research  on  the  fringes  made  to  seem  part  of  the  field?  Will 
students  who  use  this  book  have  a  perspective,  consistent  with 
others  in  the  field,  on  what  SLA  is?  Have  the  authors  fulfilled  their 
responsibility  to  those  using  the  book  to  present  a  balanced  view  of 
a  field  that  is  fast  finding  researchers  disagreeing  on  basic  issues  and 
theoretical  frameworks?  I  believe  that  Larsen-Freeman  and  Long's 
book  can  be  evaluated  quite  positively  with  regard  to  these 
questions.  A  summary  of  the  book,  with  attention  to  these  issues, 
follows. 

The  book  consists  of  eight  chapters.  The  first  is  a  lucid 
introduction,  explaining,  very  briefly,  what  the  field  is  and  that, 
while  teachers'  expectations  from  SLA  must,  at  this  point,  be 
"modest"  (p.  3),  there  is  some  relation  to  language  teaching.  They 
take  an  appropriate  middle  ground,  saying  neither  that  SLA  research 
must  serve  only  to  benefit  language  teaching,  nor  that  those  ties 
should  be  severed  (see  Newmeyer  and  Weinberger,  1988  for  this 
latter  view). 

The second chapter discusses research methodology,
including  characteristics  of  both  qualitative  and  quantitative  research, 
in  a  manner  accessible  to  new  students  of  SLA.  The  authors  are  fair 
to  both  sides,  showing  which  paradigms  can  be  used  for  which 
purposes.  Of  the  qualitative  and  quantitative  paradigms,  they  say, 
"Rather  than  seeing  them  as  competing  paradigms,  we  see  them  as 
complementary,  implying  that  it  is  unnecessary  to  choose  between 
the  two"  (p.  24).  They  also  discuss  different  types  of  data  collection 
without  advocating  one  over  the  other. 

The  third  chapter  provides  a  historical  view  of  methods  of 
analysis  in  the  field  of  SLA.  In  a  field  that  is  only  20  years  old,  it  is 
appropriate to provide a comprehensive history, particularly for
students  to  see  how  the  field  evolved  and  to  keep  them  from 
repeating  past  errors.  In  keeping  with  trends  in  the  field,  they 
appropriately  criticize  contrastive  analysis,  error  analysis,  and 
morpheme  acquisition  studies.  They  say  that  discourse  analysis 
(very  broadly  defined)  has  subsumed  previous  methods  of  analysis. 
While  this  is  true  in  relation  to  the  other  methods  of  analysis 
discussed,  those  working  in  a  Universal  Grammar  (UG)  framework 
might  take  issue  with  this  characterization. 

Chapters  four,  five,  and  six  deal  comprehensively  with 
various  findings  about  interlanguage,  the  linguistic  environment 
(input  and  interaction),  and  explanations  for  differential  success 
among  SLA  learners. 

Chapter  seven  is  a  good  introduction  to  theories  and  theory 
construction. The authors begin by comparing the set-of-laws form
and  the  causal-process  form,  clearly  showing  their  preference  for  the 
latter.  At  the  end  of  the  chapter,  they  say,  without  reference  to  any 
work,  that  not  all  in  the  field  share  their  views.  (For  opposition  to 
their  view  see  Klein,  1990  and  Markee,  1991.)  They  present  and 
critique  several  theories  of  SLA,  classifying  them  as  nativist, 
environmentalist,  or  interactionist.  Any  book  claiming  to  be  an 
introduction  to  the  field  cannot  ignore  the  fact  there  is  no  consensus 
on  SLA  theory  and  thus  Larsen-Freeman  and  Long  state  at  the  end 
of  the  chapter: 

The  rise  of  a  single  dominant  theory  which 
discourages  competing  points  of  view,  given  our  present 
limited  state  of  understanding,  would  be  counter-productive. 
We  must  guard  against  overzealousness  on  the  part  of  theorists 
or  their  devotees  who  feel  that  they  have  a  monopoly  on  the 
truth. While SLA research and language teaching will benefit
from the advantages of theoretically motivated research which
we have spelled out in this chapter, it would be dangerous at
this stage for one theory to become omnipotent. (p. 290)

Even  Beretta  (1991),  who  argues  that  multiple  theories  are 
problematic  for  SLA,  states  that  it  is  not  necessary  "for  theory 
choice to be made now" [emphasis in original] (p. 507). And as the
choice  has  not  yet  been  made,  it  is  essential  to  provide  students  of 
SLA  with  all  possible  theories. 

The  book  ends  with  a  chapter  on  instructed  SLA,  showing 
that  the  authors  are  truly  concerned  with  the  relationship  between 
instruction  and  SLA.  Research  on,  for  example,  how  instruction 
does  or  does  not  affect  developmental  sequences  should  be  of 
interest  to  any  student  or  researcher  of  SLA,  not  only  to  language 
teachers. 

Despite  the  fact  that  the  authors'  biases  can  often  be  seen 
throughout  the  book,  they  clearly  try  to  present  all  sides  of  issues 
and  at  times  explicitly  state  their  biases.  Furthermore,  they  include 
work  which  has  become  part  of  SLA  that  they  themselves  have  not 
been active in (e.g., UG, connectionism). Larsen-Freeman and
Long  admit  that  they  have  omitted  work  on  lexical  acquisition  and 
pragmatics.  Also  missing  is  much  reference  to  cognitive  theory 
including  issues  such  as  restructuring  and  automaticity. 
Furthermore,  Larsen-Freeman  and  Long,  with  one  notable 
exception,  do  not  overemphasize  issues  in  SLA  that  are  not 
mainstream.  The  exception  is  the  18  pages  devoted  to  the 
Multidimensional  Model  of  SLA.  Although  worth  including, 
particularly  because  of  its  potential  application  to  cross-linguistic 
SLA  research,  I  doubt  it  is  as  widely-cited  as  Larsen-Freeman  and 
Long's  book  might  suggest,  at  least  now.  (Other  SLA  textbooks 
(Ellis,  1986;  McLaughlin,  1987;  Gass  and  Selinker,  in  press)  give  it 
little  or  no  attention  at  all.)  Nevertheless,  this  book  is,  without  a 
doubt, the most comprehensive review of SLA research to date. It is
extremely  dense,  but  in  a  classroom  setting  beginning  students  of 
SLA  with  a  background  in  linguistics  should  find  it  accessible. 

With  regard  to  the  book's  format,  at  the  end  of  each  chapter 
there  are  excellent  comprehension  and  application  activities.  The 
lack  of  an  author  index  is,  however,  extremely  frustrating.  Upon 
finding  an  interesting  reference  in  the  bibliography,  one  has  no  way 
to  find  out  where  in  the  book  an  author's  work  is  cited,  thus 
hindering  its  use  as  a  reference. 

Earlier  I  mentioned  the  authors'  responsibilities  to  present  a 
balanced  view  of  the  field.  While  one  may  not  agree  that  such  a 
responsibility  exists,  one  cannot  argue  with  the  fact  that  an 
introductory text with two such notable authors will be widely used.
I  believe  that  instructors  using  the  book  can  feel  confident  that  their 
students  will  have  a  balanced  mainstream  view  of  the  core  issues  in 
SLA. 


Over the past fifteen years, there has been an increased
interest  in  the  cognitive  processes  which  account  for  how  the  learner 
of  a  second  language  (L2)  handles  conceptual  and  linguistic  input, 
how  this  learner  processes  this  information  to  allow  for  intake,  and 
how  the  newly-acquired  knowledge  is  used  to  produce  messages  in 
the  L2.  Second-language  learning  strategy  research  focuses  on  the 
processes  and  strategies  used  to  perceive,  internalize  and  automatize 
new  linguistic  input,  with  emphasis  on  language  learning. 
Bialystok's work, Communication Strategies, however, differs
from  other  books  on  strategy  research  (O'Malley  &  Chamot,  1990; 
Oxford,  1990;  Wenden,  1991)  in  that  it  takes  a  much  narrower 
focus,  concentrating  on  the  processes  and  strategies  a  learner 
invokes  when  declarative  and  procedural  knowledge  are  utilized  to 
communicate  a  message.  The  emphasis  of  this  book  is  on  language 
use  and  the  linguistic  and  cognitive  processes  involved  in 
communication. Thus, in providing an in-depth analysis of the
processes and strategies used in language production,
Communication  Strategies  provides  a  unique  contribution  to  learning 
strategy  research. 

Bialystok's  overall  goal  in  Communication  Strategies  is  "to 
find  a  means  of  explaining  how  strategies  function  in  the  speech  of 
L2  learners"  (p.  13).  The  book  contains  a  preface,  eight  chapters, 
notes,  references  and  an  index.  In  Chapter  1  Bialystok  finds  all  the 
definitions  of  communication  strategies  commonly  used  in  strategy 
research  to  be  ambiguous.  Bialystok  also  criticizes,  although  not 
explicitly,  the  undue  emphasis  in  strategy  research  on  definitions 
and  proposes  an  approach  to  investigating  communication  strategies 
which  fully  incorporates  the  identification,  explanation,  and 
instruction  of  communication  strategies.  The  remaining  chapters  of 
the  book  are  organized  around  these  three  points.  Chapters  2,  3, 
and  4  address  the  question  of  how  to  identify  and  categorize 
strategic  behavior  in  the  communication  of  L2  learners.  Chapters  5, 
6, and 7 explain the processes involved in L1 and L2 acquisition and
use  and  propose  a  framework  for  language  processing.  Finally, 
Chapter  8  discusses  the  pedagogical  issues  related  to  communication 
strategy  instruction. 

In  Chapter  1  Bialystok  criticizes  the  definitions  of 
communication  strategies  used  by  researchers  for  their  focus  on  the 
features  of  problematicity,  consciousness  and  intentionality.  In  her 
treatment  of  these  definitions,  she  highlights  their  ambiguous  nature 
such  as  the  fact  that  (1)  learners  use  communication  strategies  not 
only  in  problematic  situations,  but  in  non-problematic  situations  as 
well;  (2)  learners  might  or  might  not  use  these  strategies 
consciously;  and  (3)  these  strategies  could  be  invoked  with  any 
degree  of  intentionality  to  achieve  specific  communicative  goals. 
Instead, Bialystok recommends that we investigate strategy use
by  determining  a  means  to  identify  and  explain  strategic  behaviors 
and  by  assessing  the  teachability  of  these  strategies  for  purposes  of 
facilitating  more  effective  language  learning. 

Chapter  2  examines  different  ways  of  identifying  strategic 
behaviors  and  attempts  to  clarify  the  psychological  construct  of 
communication  strategy.  In  this  chapter,  Bialystok  situates 
communication  strategies  within  the  framework  of  language  use,  but 
unfortunately  makes  no  attempt  to  relate  language  use  to  a  more 
general  model  of  communicative  competence.  Rather,  she  briefly 
describes  a  hierarchical  structure  where  language  use  is  divided  into 
processes  and  strategies  and  where  strategies  are  further  subdivided 
into  communication  and  learning  strategies.  She  first  discusses  the 
distinction  between  strategies  and  processes  and  concludes  that 
"without  substantial  direction  in  how  to  proceed  with  a  distinction 
between  strategies  and  process  of  language  production,  the 
possibility  that  these  are  ultimately  not  different  events  remains 
tenable"  (p.  25).  She  then  attempts  to  differentiate  communication 
strategies  from  other  types  of  strategies  (e.g.,  production  strategies 
(Tarone,  1980),  learning  strategies  (Stern,  1983),  and  social 
strategies (Wong Fillmore, 1979)). Bialystok reports on a third
attempt  to  identify  strategic  behaviors  in  communication  which 
arises  from  the  investigation  of  systematic  differences  among 
speakers  engaged  in  different  types  of  communication.  Here  the 
manipulation  of  messages  and  linguistic  forms  is  studied  to 
determine  to  what  extent  an  original  message  was  reduced,  deleted, 
altered,  or  avoided.  I  found  Bialystok's  attempt  to  delineate 
language  use  in  this  chapter  somewhat  ambiguous.  The  language 
use  hierarchy  upon  which  the  chapter  was  based  seems  to  be 
inspired  by  disparate  theoretical  arguments  explaining  the  construct 
of communication strategies and, in my opinion, the chapter raises
more  questions  than  it  answers. 

In  Chapter  3  Bialystok  provides  a  comprehensive  summary  of 
the  major  taxonomies  used  to  classify  communication  strategies 
(Tarone,  1977;  Varadi,  1980;  Bialystok  &  Frohlich,  1980;  Faerch  & 
Kasper,  1983;  Paribakht,  1985).  Bialystok  notes  that  researchers 
seem to agree on the communicative behaviors used by L2 learners
but  asserts  that  no  single  specific  factor  seems  to  predict  the  use  of 
any  one  strategy.  The  potential  of  these  taxonomies  to  describe 
strategies  is  then  evaluated  in  Chapter  4.  Here,  the  author  reports  on 
a  study  of  the  strategic  behaviors  of  18  nine-year-old  English- 
speaking  children  learning  French.  Tarone's  taxonomy  was  used  in 
this  study  and  Bialystok  states  that  the  criteria  used  to  classify 
strategic  categories  were  sometimes  ambiguous  and  arbitrary.  In 
Chapters  3  and  4,  Bialystok  provides  a  convincing  and  insightful 
argument  illustrating  the  potential  shortcomings  of  existing 
taxonomies,  and  rightfully  concludes  that  instead  of  studying 
strategies  independently  through  definitions  or  taxonomies, 
strategies  should  be  analyzed  within  a  coherent  model  of  speech 
production. 

In  Chapter  5  Bialystok  surveys  the  research  on  child  and  adult 
strategy use in L1 production and compares this to adult strategy use
in  L2  speech.  She  maintains  that  the  communication  problems  faced 
by  children  in  early  phases  of  acquisition  are  similar  to  those 
encountered  by  L2  learners  and  remarks  that  aside  from  the  adult's 
cognitive  conceptual  maturity  and  access  to  a  developed  lexicon  in 
another  language,  the  strategies  used  by  children  and  adults  are 
identical.  This  point,  however,  seems  debatable,  if  for  no  other 
reason  than  the  comparatively  greater  variety  of  strategies  used  by 
adults  and  the  flexibility  with  which  they  use  them.  Furthermore, 
this  assertion  contradicts  previous  work  by  Brown,  Bransford, 
Ferrara,  and  Campione  (1983)  who  claim  that  strategic  behavior 
develops  with  age.  Nonetheless,  this  point  presents  an  interesting 
line  of  future  research  to  pursue.  Finally,  Bialystok  adds  that  "there 
is  no  doubt  that  adults  use  these  strategies  more  effectively,  more 
efficiently  and  more  flexibly  than  children  do,  but  there  is  no 
evidence  that  the  strategies  themselves  are  any  different"  (p.  101). 

In  Chapter  6  Bialystok  reviews  two  studies  focusing  on 
children's and adults' use of an L2. These studies, she claims, differ
from  previous  ones  in  that  "the  classification  [of  the  L2  utterances] 
is  based  on  distinctions  between  processes"  (p.  104).  This 
reference  to  "process,"  however,  is  the  source  of  considerable 
confusion  as  it  is  not  defined.  The  only  apparent  difference  between 
these  studies  and  the  previous  ones  is  that  the  classifications  in  the
current  studies  are  not  solely  generated  from  observable  utterances, 
but  are  structured  to  require  the  children  to  process  information  on  a 
metalinguistic  level  before  attempting  a  task.  For  example,  the  first 
study  investigates  the  ability  of  children  to  construct  formal 
definitions.  Snow  and  her  colleagues  (1989)  chose  this  task 
because  it  provided  a  "decontextualized  metalinguistic  use  of 
language"  as  a  process.  They  found  that  children  could  identify  and 
construct  formal  definitions  as  early  as  age  7.  The  second  study 
examined  how  adults  use  referential  strategies  in  both  their  L1  and
their  L2.  The  classification  of  the  utterances  in  this  study  was 
organized  according  to  the  production  processes  speakers  use.  The 
taxonomy  emerging  from  this  study  consisted  of  conceptual  and 
linguistic  strategies.  The  conceptual  strategies  involved 
approximation  and  circumlocution  strategies,  while  the  linguistic 
strategies  involved  borrowing,  foreignization,  transliteration,  and 
word  coinages.  I  felt  that  Chapters  5  and  6  accurately  illustrated  the 
need  to  go  beyond  research  based  on  definitions  or  taxonomies  and 
demonstrated  the  explanatory  potential  of  communication  strategy 
research  based  on  a  model  of  language  production  as  well  as  on  the 
definitions  and  taxonomies. 

Bialystok  describes  her  theoretical  model  of  language 
acquisition  and  processing  in  Chapter  7.  In  this  framework 
language  proficiency  consists  of  two  components  of  language 
processing:  the  analysis  of  linguistic  knowledge  and  the  control  of 
linguistic  processing.  The  analysis  component  refers  to  how 
language  knowledge  is  represented  and  accessed,  while  the  control 
component  deals  with  the  executive  procedures  for  performance. 
Bialystok  applies  this  framework  to  communication  strategies, 
stating  that  the  analysis-based  strategy  allows  the  L2  learner  to 
examine  and  shape  intended  meaning,  while  the  control-based 
strategy  permits  the  speaker  to  focus  on  linguistic  form  or  some 
other  source  of  information.  I  felt  that  this  framework  clearly 
illustrates  the  dynamic  interaction  between  these  two  components 
because  it  reflects  the  ways  all  people  process  language  production 
when  communication  requires  extension  or  adaptation.  In  the  case 
of  children  or  L2  learners,  this  production  system  is  often  strained 
due  to  an  inability  to  adjust  to  the  communicative  event. 

Chapter  8  superficially  discusses  the  potential  value  of 
learning  and  teaching  communication  strategies.  Bialystok  presents 
a  strong  view  of  instruction  which  maintains  that  taxonomic  listings 
can  be  taught  explicitly.  In  other  words,  learners  can  be  taught  to 
paraphrase,  to  invent  new  words,  and  the  like.  She  also  discusses 
the  moderate  view  which  states  that  strategies  can  be  presented  as
devices  to  be  used  in  solving  communication  problems.  I  felt, 
however,  that  perhaps  a  more  realistic  approach  to  learning  and 
teaching  communication  strategies  would  involve  the  combination  of 
both  views  in  accordance  with  the  changing  situational  demands  of 
the  syllabus. 

In  sum,  despite  its  shortcomings,  Communication  Strategies
is  an  inspiring  book  for  applied  linguists  who  wish  to  pursue 
research  in  learning  or  language  use  strategies.  It  provides  a  critical 
analysis  of  research  approaches  used  thus  far  in  investigating 
communication  strategies  and  proposes  new  avenues  for  further 
research  by  means  of  an  articulated  cognitive  component.  I  found 
this  book  to  be  challenging  in  places,  but  a  very  informative  read 
indeed.  Communication  Strategies  is  an  essential  source  for  those 
seeking  an  orientation  to  the  current  issues  in  learner  strategies.

Syntax:  Volume  II  is  the  second  book  in  Givon's  two-volume
morphological  and  syntactic  survey  of  language  from  a
functional  perspective.  (For  a  review  of  the  first  volume,  see  Heath,
1986.)  As  a  functionally-oriented  grammarian,  Givon  concerns
himself  not  with  formal  syntax  but  with  the  systematic  uses  to  which 
constructions  are  put.  Syntax  is  for  him  functional  in  a  strong 
sense:  the  form  of  language  is  claimed  to  be  a  direct  reflection  of 
users'  communicative  needs  at  all  levels  of  analysis.  While  the 
heavily  English-oriented  second  volume  may  be  read  independently 
of  the  first,  some  grounding  is  in  order.  For  Givon,  the  levels  of 
analysis  appropriate  to  syntax  are  the  discourse-pragmatic,  the 
propositional-semantic,  the  lexical-semantic  and  the  phrasal- 
semantic;  the  four  have  individual  requirements  which  occasionally 
conflict.  To  understand  syntax  is  to  understand  these  levels  and  the 
conflicts  among  them.  Knowledge  of  diachronic  change  is  also 
essential  to  a  proper  understanding  of  structure. 

Chapter  12,  the  opening  chapter,  deals  with  the  coherence  of 
noun  phrases  (NPs).  The  order  of  pre-  and  post-nominal  modifiers 
is  held  to  be  determined  on  a  scale  of  relevance  as  in  Bybee  (1985); 
there  is  a  partial  parallel  to  the  placement  of  complements  and 
adjuncts  in  formal  approaches.  Elements  of  NPs  tend  to  be 
contiguous  rather  than  scattered  through  a  clause  for  iconic  reasons, 
to  preserve  functional  unity.  Conjunction  of  NPs  is  limited  to  NPs 
of  equal  thematic  status  with  similar  case  roles.  "Separate  events 
will  tend  to  be  encoded  by  separate  clauses"  (p.  488);  a  fairly 
detailed  section  illustrates  the  "pragmatic-cognitive"  difficulties  of 
this  phenomenon.  In  a  section  on  nominalization  of  clauses,  a  scalar 
order  of  nominal-like  phrases  is  presented,  with  for-to  clauses  at  the
bottom  and  the  enemy's  destruction  of  the  city-type  nominals  at  the
top.  Exactly  what  this  would  mean  in  syntactic  terms  (e.g.,  the 
inability  of  infinitives  to  serve  well  as  the  subjects  of  yes-no 
questions)  is  not  addressed. 

Chapter  13  deals  with  verbal  complementation,  investigating 
the  semantic  nature  of  the  relationship  between  main  and  embedded 
clauses.  The  chapter  illustrates  what  are  seen  as  weaker  semantic
bonds  with  cognition-utterance  ('know')  verbs  compared  to  the 
successively  tighter  bonds  with  manipulative  ('order')  and  temporal 
aspect  ('finish')  types.  The  relative  strength  of  bond  is  reflected  in 
the  syntax,  attesting  to  the  latter's  iconicity.  The  more  loosely 
bound  a  main  verb  is  to  its  dependent  clause,  the  more  likely  the 
presence  of  a  complementizer;  how  this  squares  with  missing  'that' 
in  English  bridge  clauses  (I  said  he's  here)  is  unclear.

Chapter  14  covers  voice  and  detransitivization.  A  systematic 
comparison  is  made  of  active/passive/antipassive/reflexive  with 
regard  to  topicality,  case  marking,  promotion,  and  demotion.  The 
need  for  semantic  marking  of  passive  topics  conflicts  with  the 
requirement  for  pragmatic  marking,  creating  a  'functional  bind'  for 
the  morphology;  this  bind  is  treated  in  greater  detail  in  Volume  I. 

Most  types  of  relative  clauses  are  treated  in  Chapter  15. 
Restrictives  and  nonrestrictives  are  contrasted  with  respect  to  their 
position  in  the  higher  clause,  marking  of  verbal  elements,  use  of 
relative  pronouns,  presence  of  gaps  vs.  resumptives,  and  word 
order.  Sections  on  clausal  extraposition,  the  Complex  Noun  Phrase 
Constraint  (CNPC),  and  center  embedding  underscore  the  gulf 
between  the  formal  approaches  and  Givon's  functional  approach: 
the  CNPC  is  seen  here  as  a  limit  which  is  probably  based  on 
physical  distance  rather  than  on  constituent  structure  alone  (cf. 
Newmeyer's  (1983)  discussion  of  Givon  (1979)  on  this  point). 

The  next  two  chapters  treat  contrastive  focus  and  marked 
topic  constructions.  Much  space  is  devoted  to  illustrating  what  is 
presented  at  the  end  of  Chapter  16  as  the  "preposed  order  principle," 
according  to  which  less  predictable  but  more  important  information 
is  viewed  as  "more  likely  to  be  placed  earlier  in  the  clause"  when 
placed  in  the  context  of  processing  and  memory  (p.  737).
Exceptions  (such  as  pseudoclefted  NPs  and  determiner-noun  order) 
abound,  but  are  not  addressed  as  counterexamples.  In  a  natural 
continuation,  Chapter  17  deals  with  marked  topics  including  shifted
datives  (viewed  as  topic  promotions)  and  raising  constructions. 
Raising  to  object,  now  a  relic  in  generative  theory,  is  argued  also  to 
be  a  case  of  topic  promotion.  Without  a  true  raising  rule,  however, 
the  chameleon-like  nature  of  belief-type  objects  could  be 
reinterpreted  as  a  case  of  closer  event  integration  of  the  lower  with 
the  higher  clause. 

Chapter  18  is  an  overly  cursory  treatment  of  non-declarative 
sentence  types.  These  types  are,  as  often  is  the  case  with  types  in 
Givon's  work,  placed  on  a  continuum  with  prototype  peaks.

The  lengthy  Chapter  19,  "Interclausal  Coherence,"  covers
the  relations  between  adverbial  and  main  clauses,  coordinate 
clauses,  and  larger  discourse  units.  The  links  among  such  units  are 
argued  to  be  looser  than  those  between  main  and  complement 
clauses.  Semantic  evidence  is  a  greater  freedom  to  break  continuity 
links;  syntactic  evidence  lies  in  intonational  contours  and  the  ability 
of  adverb  clauses  to  prepose,  thereby  effecting  higher  topical  status. 
Participial  clauses  are  seen  as  more  or  less  integrated,  depending  on 
their  type.  Introducing  yet  another  scale,  Givon  redefines  "finite" 
as  a  complex  of  features  including  tense-aspect-modality, 
agreement,  and  other  (including  nominal)  affixes.  There  is  also  a 
long  section  on  clause-chaining  and  typology. 

Chapter  20,  "The  Grammar  of  Referential  Coherence:  A 
Cognitive  Reinterpretation,"  might  have  been  more  fitting  as  an 
opening  chapter  of  the  book.  Grammar  is  reinterpreted  "as  mental 
processing  instructions"  (p.  894)  and  Givon  promises  testable 
hypotheses  based  on  domains  outside  grammar,  though  we  are 
largely  left  without  a  clear  means  to  test  for  the  derivative  status  of 
grammar.  The  "mental  proposition"  is  the  basic  unit  of  stored 
information,  but  since  discourse  is  multi-propositional  and  shared, 
grounding  is  necessary.  New  propositions  are  viewed  as  a 
felicitous  combination  of  old  and  new  information,  with  the  former 
serving  to  ground  and  the  latter  serving  to  move  the  discourse  along. 
Grounding  is  based  in  grammatical  devices  which  code  referentiality 
and  definiteness;  thematic  coherence  across  clauses  is  established 
primarily  by  the  "grammar  of  topicality"  (the  establishment  of  topics 
using  mainly  nominal  arguments  as  signals).  "Coherent  discourse  is 
characterized  by  equi-topic  clause  chains"  (p.  902).  There  is  an 
attempt  to  underpin  the  notion  of  referential  coherence  in  cognitive 
terms;  definite  vs.  indefinite  NPs  and  lexical  vs.  pronominal  NPs 
are  reinterpreted  in  terms  of  "active  vs.  existing  memory  files"  and 
"short-  vs.  long-term  memory  searches." 

The  final  chapter  explores  the  concepts  of  markedness  and 
iconicity  in  syntax.  Markedness  is  seen  as  a  function  not  only  of 
relative  structural  complexity  and  frequency  but  also  of  cognitive 
complexity,  the  last  being  defined  in  terms  of  "attention,  mental 
effort  or  processing  time"  (p.  947).  Canonical  declaratives  are  seen
as  the  unmarked  type;  the  prevalence  of  non-declarative 
manipulatives  in  early  child  speech  is  seen  as  an  evolutionary 
throwback  to  stages  when  such  clauses  were  unmarked. 
Markedness  scales  for  nouns  and  verbs  with  respect  to  affixes  and 
referentiality  are  a  carryover  from  Givon's  longer  treatments  in 
Volume  I. 
Autonomous  syntax  is  here  repudiated  much  as  it  is  in  Givon
1984.  Three  "iconic  coding  principles"  are  set,  two  of  them  clearly
morphosyntactic.  The  "quantity  principle"  gives  the  larger,  less 
predictable,  and  more  important  information  a  larger  "chunk  of 
code"  or  more  "coding  material"  (p.  969).  How  the  terms  'important'
and  'predictable'  are  operationally  defined  and  how  the  three  might 
be  collectively  measured  in  their  interaction  is  not  explained.  The 
"proximity  principle"  places  "functionally,  conceptually,  or 
cognitively"  similar  "entities"  closer  together  in  the  sentence,  as
evidenced  in  the  relative  syntactic  integration  of  two  clauses  (cf. 
Chapter  13)  and  the  relatively  non-scattered  nature  of  elements  of 
phrases  within  clauses  (cf.  Chapter  12).  The  "linear  order  principle" 
places  clauses  in  connected  discourse  in  sequential  order  in 
unmarked  cases.  What  is  non-iconic  in  syntax  is  held  to  combine 
with  the  iconic  so  as  to  "reinforce"  the  latter.  In  the  final  section, 
there  is  an  attempt  to  ground  iconicity  in  biology  with  arguments 
from  genetics  and  animal  communication. 

In  a  general  way,  the  sequencing  from  the  beginning  of  the 
first  volume  to  the  end  of  the  second  involves  a  movement  from 
smaller-scale  phenomena  (e.g.,  case  marking,  tense-aspect- 
modality,  agreement)  to  large  units  of  language  that  extend  beyond 
syntax  in  the  usual  sense.  As  the  sequence  proceeds,  the  case  for  a 
functional  approach  seems  to  grow  roughly  with  the  size  of  the  unit 
examined.  The  larger  the  unit,  the  more  the  syntactic  choices  for  the 
user  of  language,  and  where  genuine  options  exist,  the  investigator 
can  study  the  contexts  for  the  choices.  Givon  is  at  his  most 
plausible  in  reporting  the  high  likelihood  of  referential  continuity 
across  and-clauses  as  compared  to  but-clauses  (Chapter  19)  or  on
the  pragmatics  of  marked  topic  constructions,  where  a  systematic 
comparison  of  discourse  anaphors,  dislocated  and  other  NPs  is 
made  with  regard  to  referential  distance  from  like  material.  The 
repeated  references  to  the  fact  that  speakers  make  syntactic  choices 
ring  far  truer  here  than  in  Volume  I,  where  the  more  basic  elements
of  syntax  discussed  are  simply  given  by  the  grammar;  one  can 
reasonably  choose  to  cleft,  but  one  cannot  dictate  the  form  a  cleft 
will  take.  Here  the  two  senses  of  functionalism  are  confused:  one 
is  the  functionalism  of  day-to-day  usage,  the  other  the  alleged 
functionalism  of  linguistic  evolution.  It  is  possible  to  accept  the  one 
without  completely  accepting  the  other. 

If  this  volume  (and  the  first)  purported  to  be  concerned  with 
the  discourse  functions  of  syntactic  constructions,  it  would  be  easier 
to  accept  on  the  whole.  The  fact  that  it  attempts  to  treat  syntax  as  an 
essentially  discourse-grounded  phenomenon  makes  it  harder  to 
accept.  Formal  relations  are  a  ghost  in  Givon's  machine;  yet  the
existence  of  formal,  less  plainly  iconic  phenomena  is  implicitly 
suggested  throughout  the  book.  If  She  is  believed  to  be  a  crook  is 
an  example  of  topic  promotion  from  the  embedded  clause,  then  *She 
is  believed  [  ]  is  a  crook  ought  also  to  be  possible;  it  is  not,  but  in 
this  framework  we  have  no  apparent  means  of  determining  why. 
The  fact  that  that-clauses  are  less  referentially  integrated  than
infinitives  does  not  really  solve  the  problem  because  one  may 
topicalize  from  either  an  infinitive  or  a  complement  clause. 
Similarly,  Givon's  pervasive  use  of  scalarity  would  seem  to  predict 
more  of  a  continuum  of  word  order  possibilities  than  actually  exists;
focusing  may  be  an  initial  (Him  I  dislike),  medial  (It's  him  I
dislike),  or  final  (I  dislike  him,  but  I  like  her)  option,  but  the  medial
*I  him  dislike  is  disallowed  for  unexplained  reasons.  Surely  such  a
sentence  is  more  than  merely  pragmatically  inefficient.  In  the  first 
volume  constraints  are  said  to  exist  (1984:  36),  but  it  is  not  clear 
that  they  are  ever  explicitly  elaborated.  In  fact,  we  do  not  know 
exactly  what  they  are  constraints  on.  In  general,  we  do  not  learn 
how  functional  considerations  alone  can  really  predict  the  form  that  a 
given  language  will  take  or  be  prevented  from  taking. 

The  term  "functional  grammar"  means  different  things  to 
different  people.  (For  a  review  of  these  different  meanings,  see 
Tomlin  1990.)  If  researchers  seek  correlations  between  particular 
structural  types  and  particular  discourse  functions,  that  is  a  relatively 
modest  goal.  If  one  seeks  to  render  formal  syntactic  theory 
superfluous,  that  is  a  much  more  ambitious  goal,  and  one  which 
may  not  be  feasible  (cf.  Newmeyer  1983:  119ff.).  At  any  rate,  one
needs  a  fully  developed  theory  of  discourse  requirements  which 
exists  independently  of  grammar,  as  well  as  a  method  of  mapping 
function  onto  structure,  and  a  means  of  accounting  for 
counterexamples,  in  order  to  successfully  derive  syntax  from  the 
theory.  An  attempt  to  show  that  a  specific  construction  is 
configured  in  a  particular  way  simply  because  that  is  the  most 
functional  way  for  it  to  be  configured  must  show  independently  that 
some  other  arrangement  would  not  be  equally  functional.  In  the 
second-to-last  chapter,  Givon  does  actually  begin  to  do  this  by 
linking  grammar  to  cognitive  principles,  but  the  discussion  remains 
quite  speculative. 

A  more  modest  goal  for  a  functional  grammar  is  worthy  in 
spite  of  being  modest.  It  is  worthwhile,  for  example,  for  second- 
language  teachers  to  know  more  than  simply  how  wh-clefts  are 
formed;  it  is  worthwhile  to  know  what  they  do,  and  for  this  purpose 
a  functional  complement  to  traditional  or  formal  grammar  is  useful. 
This  complement  need  not  seek  a  discourse  function  for  every  aspect
of  syntactic  structure;  even  if  the  search  were  successful,  it  is  not 
certain  what  role  some  facts  (e.g.,  "aspect  before  tense")  could  play 
in  teaching.  Many  of  the  areas  treated  in  Syntax:  Volume  II  do 
seem  relevant  in  just  this  respect,  in  particular  speech  acts, 
contrastive  focus,  interclausal  coherence,  and  generally  all  areas 
which  deal  with  topicality.  Overall,  however,  this  volume  does  not 
quite  fit  the  description  of  a  book  fulfilling  the  more  modest  goal. 
Perhaps  no  existing  book  does,  though  Dik  (1989)  and  Halliday 
(1985)  seem  to  come  closer. 